Metadata Normalisation in Europeana The Hague, 13 & 14 January 2009 Julie Verleyen Scientific Coordinator, Europeana Office EuropeanaLocal Knowledge Sharing Workshop

Presentation transcript:

Metadata Normalisation in Europeana The Hague, 13 & 14 January 2009 Julie Verleyen Scientific Coordinator, Europeana Office EuropeanaLocal Knowledge Sharing Workshop

Session
A. Workflow
B. Metadata normalisation with ESE
C. Approach in practice: Demo of tools used
D. Knowledge sharing workshop: Discussion of the practice for EuropeanaLocal

A. Workflow

CONTENT SURVEY #0

Stage #0: Content survey
Input: questionnaire
Output: specifications of content contribution (Excel specs)

Stage #1: Harvesting and package creation
Input: Excel specs
Output:
- Harvested data in XML (XML raw data)
- Collection-specific analysis tool (HTML analysis tool)
- Sample of source data: 1000 records (XML sample raw data)
- Mapping specifications template (TXT mapping template)

Stage #2: Analysis and mapping specifications
Input: Excel specs, HTML analysis tool, XML sample raw data, TXT mapping template
Output: TXT mapping specs

Stage #3: Mapping and normalisation
Input: XML raw data, TXT mapping specs
Output: XML normalised mapped data, XML profile
Quality check

NORMALISER

STAGE 3

Stage #4: Database storage and indexing
Input: XML normalised mapped data
Output: DB, INDEX

B. Metadata normalisation with ESE

Europeana Semantic Elements (ESE)
- Europeana "Schema" for the Prototype
- Based on Dublin Core Metadata Element Set (DCMES) (ISO )
- 49 Elements (26 Elements & 23 Refinements)
- Created through discussions in July/August 2008

ESE specialities
- europeana:country
- europeana:provider (dc:source)
- europeana:language (dc:language)
- europeana:type (dc:type, dc:format)
- europeana:year (dc:date)
- europeana:isShownBy (dc:relation)
- europeana:isShownAt (dc:relation)
- europeana:object
- europeana:uri (dc:identifier)
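The list above can be held as a small lookup table. This is only a sketch: it assumes that the Dublin Core names in parentheses are the source elements from which each Europeana element is derived, and the names `ESE_SOURCE_ELEMENTS` and `ese_elements_derived_from` are made up for illustration.

```python
# Assumption: the Dublin Core names in parentheses on the slide are the
# source elements each Europeana-specific element is derived from.
ESE_SOURCE_ELEMENTS = {
    "europeana:country": (),
    "europeana:provider": ("dc:source",),
    "europeana:language": ("dc:language",),
    "europeana:type": ("dc:type", "dc:format"),
    "europeana:year": ("dc:date",),
    "europeana:isShownBy": ("dc:relation",),
    "europeana:isShownAt": ("dc:relation",),
    "europeana:object": (),
    "europeana:uri": ("dc:identifier",),
}


def ese_elements_derived_from(dc_element):
    """Return the Europeana elements that list dc_element as a source, sorted."""
    return sorted(
        ese for ese, sources in ESE_SOURCE_ELEMENTS.items() if dc_element in sources
    )
```

Calling `ese_elements_derived_from("dc:relation")` returns both link elements, which is a reminder that one source field may feed more than one normalised element.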

ESE specialities
All normalised:
- Syntax
- Value
Let's examine their characteristics

Normalised ESE terms: Country
Definition: Country of content provider. If several countries: Europe
Format: String, e.g. switzerland, germany, …
Reference: TEL controlled list. Supports TEL interface translation mechanism
Mechanism: Manual
In portal: Facet browsing of search results

Normalised ESE terms: Provider
Definition: Organisation sending the data to Europeana
Format: String, e.g. Musées lausannois, Nasjonalbiblioteket, …
Reference: Europeana controlled list of content providers:
Mechanism: Manual, but can potentially be automated
In portal: Facet browsing of search results

Normalised ESE terms: Language
Definition: Language of provider's country (ESE: languages of the metadata)
Format: 2 letters, e.g. it, no, fr, en, es, …
Reference: ISO 639-1 language codes
- Exception: If several languages: "mul"
Mechanism: Manual, but can potentially be automated
In portal: Facet browsing of search results
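The slide notes that this step is manual but could potentially be automated. A minimal sketch of such an automation follows, assuming the input is already a list of ISO 639-1 codes; the function name `normalise_language` is hypothetical, and the check only verifies the two-letter shape, not membership in the ISO 639-1 list.

```python
import re

# Shape check only: two lowercase ASCII letters. This does not verify that
# the code is actually registered in ISO 639-1.
_TWO_LETTER = re.compile(r"^[a-z]{2}$")


def normalise_language(codes):
    """Return one 2-letter code, or "mul" when several languages are given."""
    cleaned = {c.strip().lower() for c in codes if c and c.strip()}
    if not cleaned:
        raise ValueError("no language code supplied")
    if len(cleaned) > 1:
        # Exception from the slide: several languages are collapsed to "mul".
        return "mul"
    (code,) = cleaned
    if not _TWO_LETTER.match(code):
        raise ValueError(f"not a 2-letter code: {code!r}")
    return code
```

For example, `normalise_language(["IT"])` gives `"it"` and `normalise_language(["fr", "en"])` gives `"mul"`.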

Normalised ESE terms: Type
Definition: Type of the original object
Format: String
Reference: 4 Europeana types: IMAGE, TEXT, SOUND, VIDEO
Mechanism: Manual: Mapping specified by content provider
In portal:
- Categorisation display
- Facet browsing of search results
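Applying a provider-specified type mapping could look roughly like the sketch below. The function `normalise_type` and the sample mapping values are hypothetical; only the four target types come from the slide.

```python
EUROPEANA_TYPES = ("IMAGE", "TEXT", "SOUND", "VIDEO")


def normalise_type(source_value, provider_mapping):
    """Map a provider's type value to one of the four Europeana types.

    provider_mapping is the (hypothetical) table supplied by the content
    provider, e.g. {"film": "VIDEO"}. Returns None for unmapped values.
    """
    lookup = {k.strip().lower(): v for k, v in provider_mapping.items()}
    target = lookup.get(source_value.strip().lower())
    if target is None:
        return None
    if target not in EUROPEANA_TYPES:
        raise ValueError(f"not a Europeana type: {target!r}")
    return target


# Hypothetical mapping for a moving-image collection.
sample_mapping = {"Film": "VIDEO", "Photograph": "IMAGE"}
```

Returning `None` for unmapped values leaves the decision about such records to a later step rather than guessing a type.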

Normalised ESE terms: Year
Definition: Date of creation of the original object (analog or born digital)
Format: 4 digits [YYYY], e.g. 1950
Reference: Europeana year
Mechanism: Automatic extraction with "YearExtractor" converter
In portal:
- Facet browsing of search results
- Browsing by time (timeline)
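The slides do not show how the "YearExtractor" converter works internally. As an illustration of the idea only, the sketch below assumes a simple rule: take the first standalone run of exactly four digits in the date string. The function name `extract_year` is made up.

```python
import re

# A run of exactly four digits, not embedded in a longer digit sequence.
_YEAR = re.compile(r"(?<!\d)\d{4}(?!\d)")


def extract_year(date_value):
    """Return the first standalone 4-digit sequence as a string, or None."""
    match = _YEAR.search(date_value)
    return match.group(0) if match else None
```

Under this rule `extract_year("ca. 1890")` yields `"1890"`, while a value with no four-digit group yields `None`. A real converter would probably need extra rules for date ranges and for four-digit numbers that are not years.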

Normalised ESE terms: isShownBy
Definition: URL to the digital object
Format: URL (
Mechanism: Automatic or manual
In portal: Linking

Normalised ESE terms: isShownAt
Definition: URL to the digital object with context
Format: URL (
Mechanism: Automatic or manual
In portal: Linking

Normalised ESE terms: Object
Definition: URL to the digital object as thumbnail
Format: URL (
Mechanism: Automatic or manual
In portal: Display

Normalised ESE terms: URI
Definition: Record identifier for Europeana system
Format: URI
Mechanism: Automatic: special algorithm guaranteeing uniqueness (and integrity) of records
In portal:
- MyEuropeana
- Full digital object view in Europeana
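The slides do not describe the "special algorithm". Purely to illustrate how a stable, collision-resistant record identifier can be derived, the sketch below hashes a collection identifier together with the provider's record identifier. Everything in it is an assumption: the base URI is a placeholder, and `make_record_uri` is a hypothetical name, not Europeana's method.

```python
import hashlib

# Placeholder base URI, not the real Europeana namespace.
BASE_URI = "http://example.org/record/"


def make_record_uri(collection_id, provider_record_id):
    """Derive a deterministic URI from collection and record identifiers."""
    # The unit-separator character keeps ("a", "bc") distinct from ("ab", "c").
    key = f"{collection_id}\x1f{provider_record_id}".encode("utf-8")
    return BASE_URI + hashlib.sha256(key).hexdigest()
```

Because the result depends only on the two inputs, re-ingesting the same record yields the same URI, which is one simple way to keep identifiers stable across harvests.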

C. Approach in practice: Demo of tools used

Metadata normalisation in practice
MAPPING & NORMALISATION #3
Demo of stage #3's workflow:
1. Go through data of example collection #1
2. Practical exercise: let's normalise example collection #2 for Europeana!
3. Two examples of known issues

SUBVERSION (SVN)

COLLECTION FOLDER: SOURCE XML, MAPPING SPECS TXT, OUTPUT XML, MAPPING/NORM. SPECS XML

Example 1: "Midas" collection
83 moving image records from the Association des Cinémathèques Européennes
- Harvested data
- Fields mapping/Type values mapping specs
- Analysis file (source data)
- Mapping file
- Profile file
- Analysis file + sample (normalised data)

Example 2: “Outsider Art Museum” collection 4142 records from the Musées Lausannois

Known issues with mapping/profile files
1. Wrong syntax in mapping file causes errors in profile.xml:
- If "=>" is used in a comment in mapping.txt, it creates a mapping entry in profile.xml! Ex: ………

BEFORE

AFTER

Known issues with mapping/profile files
2. Wrong syntax in mapping file causes errors in profile.xml:
- There should be 2 blanks between "=>" and "N/A", not one; otherwise the mapping specification is not well formatted in XML in profile.xml. Ex: ………………….

MAPPING.TXT PROFILE.XML MAPPING.TXT PROFILE.XML profile.xml with error: 2 white spaces!

Documentation in Europeana context
- Europeana Semantic Elements (ESE) v3.1
- "Europeana – Data Offline Preparation"
- Commented version of "profile.xml"
- "Quality Control Checklist"

D. Knowledge sharing workshop: Discussion of the practice for EuropeanaLocal

Discussion
- Questions about Europeana metadata ingestion/normalisation process?
- Integration and/or compatibility of this process with EuropeanaLocal content strategy:
  - Where will normalisation take place?
  - By whom?
  - …

Thank you

Discarding factors during normalisation
- Duplicated records
- Records without URLs to digital object
- Records without Europeana type (SOUND, TEXT, IMAGE, VIDEO)
- Records to copyright-protected digital objects
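The discarding factors can be read as a filter over the records. The sketch below is one possible reading, not the actual normaliser: records are plain dictionaries, the keys `identifier`, `isShownBy`, `isShownAt`, `type` and `copyright_protected` are hypothetical, and the slides do not say how copyright status is determined, so it is modelled as a simple flag.

```python
EUROPEANA_TYPES = {"IMAGE", "TEXT", "SOUND", "VIDEO"}


def filter_records(records):
    """Split records into (kept, discarded); discarded holds (record, reason)."""
    kept, discarded, seen = [], [], set()
    for rec in records:
        ident = rec.get("identifier")
        if ident in seen:
            discarded.append((rec, "duplicate"))
            continue
        seen.add(ident)
        # Assumption: either link element counts as a URL to the digital object.
        if not (rec.get("isShownBy") or rec.get("isShownAt")):
            discarded.append((rec, "no URL to digital object"))
        elif rec.get("type") not in EUROPEANA_TYPES:
            discarded.append((rec, "no Europeana type"))
        elif rec.get("copyright_protected"):  # hypothetical flag
            discarded.append((rec, "copyright-protected"))
        else:
            kept.append(rec)
    return kept, discarded


sample = [
    {"identifier": "1", "isShownBy": "http://example.org/1.jpg", "type": "IMAGE"},
    {"identifier": "1", "isShownBy": "http://example.org/1.jpg", "type": "IMAGE"},
    {"identifier": "2", "type": "TEXT"},
    {"identifier": "3", "isShownAt": "http://example.org/3", "type": "MAP"},
    {"identifier": "4", "isShownAt": "http://example.org/4", "type": "VIDEO",
     "copyright_protected": True},
]
kept, discarded = filter_records(sample)
```

Keeping a reason with each discarded record makes it easier to report back to the content provider why records did not reach the portal.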