Semantic Annotations in the Archaeological Domain
Andreas Vlachidis, Ceri Binding, Keith May, Douglas Tudhope
STAR: Semantic Technologies for Archaeological Resources



About This Presentation
- The STAR project
  - Aims and Objectives
  - Architecture of Semantic Access to Disparate data sets
  - Adapted Conceptual Models and Knowledge Resources
  - Progress to date and available Web services
- Semantic Annotations Pathway
  - The aim of the Research
  - OBIE for rich, semantic indexing
  - Domain Specific Requirements
- Excavating Grey Literature Documents
  - General Architecture for Text Engineering (GATE)
  - Rule Based Pattern Matching Approaches
  - 'Gold Standard' Pilot Evaluation
- Adaptation Issues and Conclusions
  - Ontological Model Verbosity
  - Prototype Query Builder
  - Prototype Indexing Deployment

The STAR Project
- 3-year AHRC-funded project
- Started January 2007, finishing December 2009
- Collaborators
  - English Heritage
  - RSLIS, Denmark
- Aims
  - To investigate the potential of semantic terminology tools for widening access to digital archaeology resources, including disparate datasets and associated grey literature
  - To demonstrate cross search and browsing at a detailed, meaningful level

STAR - General Architecture
[Figure: layered architecture diagram. Data sources: RRAD, RPRE, LEAP, STAN, IADB; grey literature; EH thesauri, glossaries. Processing: data mapping / normalisation, conversion, indexing. Core: RDF-based common ontology data layer (CRM / CRMEH / SKOS). Access: web services, SQL, SPARQL. Applications: server side, rich client, browser.]

Conceptual Models and Knowledge Resources
- CRM
  - CIDOC Conceptual Reference Model
  - International standard ISO 21127:2006
- CRMEH
  - English Heritage Ontological Model
  - Extends CIDOC CRM for the archaeological domain
- SKOS
  - Simple Knowledge Organization System
  - RDF representation of thesauri, glossaries, taxonomies, classification schemes etc.

CIDOC Conceptual Reference Model
- "The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework that any cultural heritage information can be mapped to"
- About 80 classes and 130 properties for cultural and natural history
- Intellectual guide to create schemata, formats, profiles
  - Extension of CRM with a categorical level, e.g. reoccurring events
- Best practice guide for data integration (mapping)
  - Transportation format for data integration / migration / Internet

CIDOC Conceptual Reference Model
[Figure: top-level CRM classes and the relationships between them. Classes: E1 CRM Entity; E2 Temporal Entities (Events); E39 Actors (persons, institutions); E19 Physical Objects; E53 Places; E52 Time-Spans; E41 Appellations; E55 Types. Relationship labels: participate in; have location; within; at; refer to; refer to / identify; refer to / refine.]

CRMEH - English Heritage Ontological Model
- Adopting and extending CRM for a complete picture of on-site and off-site processes.
- Entities and relationships relating to stratigraphic relations and phasing information, finds recording and environmental sampling.
- The extended CRM model, CRM-EH, comprises 125 extension sub-classes and 4 extension sub-properties.
- Multiple disconnected databases and legacy data: CRM as 'semantic glue' to pull the data together

CRMEH: A Closer Look
[Figure: fragment of the CRM-EH model around EH_E0007.Context. Entities: EH_E0007.Context, EH_E0061.ContextUID, EH_E0005.Group, EH_E0022.ContextDepiction, EH_E0008.ContextStuff, E54.Dimension, E60.Number, E62.String, E55.Type, E58.MeasurementUnit. Properties: P2.has_type, P3.has_note, P3.1.has_type, P43.has_dimension, P87.is_identified_by (identifies), P89.falls_within, P90.has_value, P91.has_unit, EH_P3.occupied. Callouts: "Description, Interpretive comments, Post-ex comments" and "Length, width, height, diameter etc."]

Simple Knowledge Organisation System
- Standard set for representation
  - Thesauri, Taxonomies, Classification Schemes
- Publication of controlled structured vocabularies
  - Intended for the Semantic Web
- Built upon standard RDF(S)/XML W3C technologies
- Looser semantics than e.g. OWL
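To make the SKOS representation concrete, here is a minimal Python sketch that writes a single thesaurus term as a SKOS concept in Turtle syntax. The scheme URI, the concept identifiers and the labels are assumptions chosen for illustration; they are not taken from the English Heritage thesauri. Only the `skos:` property names come from the SKOS vocabulary itself.

```python
# Hypothetical scheme URI and concept data, for illustration only.
SCHEME = "http://example.org/thesaurus/"
PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def skos_concept_turtle(concept_id, pref_label, broader_id=None, alt_labels=()):
    """Serialise one thesaurus term as a SKOS concept in Turtle syntax.

    Labels are assumed not to contain double quotes (no escaping is done).
    """
    lines = [
        f"<{SCHEME}{concept_id}> a skos:Concept ;",
        f'    skos:prefLabel "{pref_label}"@en ;',
    ]
    for alt in alt_labels:
        lines.append(f'    skos:altLabel "{alt}"@en ;')
    if broader_id is not None:
        # skos:broader links a term to its parent in the hierarchy.
        lines.append(f"    skos:broader <{SCHEME}{broader_id}> ;")
    lines.append(f"    skos:inScheme <{SCHEME}> .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(PREFIX)
    print(skos_concept_turtle("ditch", "DITCH", broader_id="linear-feature"))
```

The hierarchy is carried by `skos:broader` alone, which reflects the "looser semantics" point: the relation says one concept is more general than another without committing to a formal subclass axiom.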

English Heritage Thesauri
- Monument types thesaurus
  - Classification of monument type records
- Evidence thesaurus
  - Archaeological evidence
- MDA object types thesaurus
  - Archaeological objects
- Building materials thesaurus
  - Construction materials
- Archaeological sciences thesaurus
  - Sampling and processing methods and materials
- Timelines thesaurus
  - Periods, and time-based entities

Data Mapping and Extraction
- Extraction of data to RDF triples
  - 5 archaeological datasets
  - Custom data extraction application
- Conversion of controlled terminology
  - 7 thesauri converted to SKOS
  - 27 glossaries created in SKOS
  - Created based on recording manuals
  - MultiTes XSL transformation to SKOS

Applications and Utilities
- Data Mapping and Extraction Utility
  - Bespoke mapping/extraction utility
  - Extract archaeological data conforming to mapping
  - Semi-automated manner
- Prototype CRM Browser
  - Query entry of free-text search terms
  - Option to navigate the results of returned queries

STAR Data Mapping and Extraction Utility
- Entry boxes corresponding to the Entity-Relationship-Entity elements of the CRM-EH statement.
- SQL query build-up: the SQL query incorporates selectable, consistent URIs (CRM, CRM-EH, SKOS, Dublin Core and others).
- Query execution against the selected database
- Tabular data export to RDF format file
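The workflow above (build a SQL query that already contains the URIs, run it, export the tabular result as RDF) can be sketched as follows. This is not the STAR utility itself: the table layout, the `contexts` table name and both `example.org` namespaces are assumptions, and SQLite stands in for whichever database is selected.

```python
import sqlite3

# Hypothetical namespaces; the real dataset and CRM-EH URIs are not given in the slides.
DATA_NS = "http://example.org/data/"
CRMEH_NS = "http://example.org/crmeh#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def demo_connection():
    """Create an in-memory database with a toy 'contexts' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contexts (context_id INTEGER, description TEXT)")
    conn.executemany(
        "INSERT INTO contexts VALUES (?, ?)",
        [(22, "pit fill"), (23, "ditch cut")],
    )
    return conn


def export_contexts(conn):
    """Run a query whose three columns are subject, predicate and object URIs,
    then write each result row as one N-Triples line."""
    sql = """
        SELECT ? || 'context/' || context_id AS subject,
               ?                             AS predicate,
               ? || 'EH_E0007.Context'       AS object
        FROM contexts
        ORDER BY context_id
    """
    rows = conn.execute(sql, (DATA_NS, RDF_TYPE, CRMEH_NS))
    return [f"<{s}> <{p}> <{o}> ." for s, p, o in rows]


if __name__ == "__main__":
    for line in export_contexts(demo_connection()):
        print(line)
```

Putting the URI construction inside the query keeps the export step trivial: every result row is already an Entity-Relationship-Entity statement, so the exporter only has to serialise it.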

Prototype CRM Browser
- Test and demonstrate interoperability between datasets.
- Incorporates the SKOS-based thesauri browsing interface
- Distinguishes between results by colour coding
- Search for "Nauheim Brooch", browse results and 'drill' deeper
- Link to live data via returned URL hyperlinks

Semantic Annotations Pathway
- Semantic Annotations
  - A specific metadata generation and usage schema
  - Aimed at automating the identification of concepts and their relationships in documents
- Research effort
  - Directed towards the generation of rich document indices carrying semantic and interoperable properties, for the purposes of semantic interoperability.

Ontology Based Information Extraction
- Advance Information Retrieval
  - Beyond the limitations of words to the level of concepts
- Aid Information Retrieval
  - To make inferences from heterogeneous data sources
- Information Extraction
  - A specific text analysis task aimed at extracting specific information snippets from documents
- Ontologies to drive/inform IE
  - To describe the conceptual arrangements of semantic annotations.
- Ontologies: a mediator technology between concepts and their worded representations

Archaeology Domain & Upper Level Ontologies
- Thomson Reuters - Gnosis Plug-in
- Limitations of upper-level and lightweight ontologies in specialised domains
  - e.g. an archaeology grey literature document

Excavating Grey Literature Documents
- Raunds reports
- Online AccesS to the Index of archaeological excavationS (OASIS)
  - Library of unpublished fieldwork reports
- English Heritage Listed Buildings System (LBS)
- Semantic Indexing
  - Interoperable technologies, W3C standards
  - XML, RDF representation
  - TEI adoption
- Grey literature: source materials that cannot be found through the conventional means of publication

Information Extraction Framework
[Figure: information extraction framework built on the General Architecture for Text Engineering. Components shown: ADS - OASIS grey literature; EH Thesaurus (Object Types, Archaeological Periods); Ontology (CIDOC CRM-EH); Gazetteer Lists; Java Pattern Engine; XML structures to represent semantic properties.]

GATE Mapping of Knowledge Resources
[Figure: mapping of knowledge resources into GATE. Components shown: CRM-EH classes (CIDOC E53.Place, EH E0007 Context, EH E0005 Group); EH thesauri and glossaries; Onto-Gazetteer utility; gazetteer lists; natural language examples "Layer" and "22 Pits".]
- The reference to SKOS is mapped to the MinorType attribute of list entries.
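The idea of carrying a SKOS reference in the MinorType attribute of a gazetteer entry can be illustrated with a small Python stand-in for a gazetteer look-up. This does not use GATE; the entries, the `skos-...` identifiers and the naive plural handling are assumptions made for the sketch.

```python
import re

# Toy gazetteer: each entry carries an ontology class as majorType and a
# (hypothetical) SKOS concept identifier as minorType.
GAZETTEER = [
    {"term": "layer", "majorType": "EH_E0007.Context", "minorType": "skos-0001"},
    {"term": "pit", "majorType": "EH_E0007.Context", "minorType": "skos-0002"},
    {"term": "pottery", "majorType": "E19.PhysicalObject", "minorType": "skos-0003"},
]


def lookup(text, gazetteer=GAZETTEER):
    """Return look-up annotations for every gazetteer term found in the text.

    Matching is case-insensitive on whole words, with an optional plural 's'.
    """
    annotations = []
    for entry in gazetteer:
        pattern = r"\b" + re.escape(entry["term"]) + r"s?\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append({
                "start": match.start(),
                "end": match.end(),
                "text": match.group(0),
                "majorType": entry["majorType"],
                "minorType": entry["minorType"],
            })
    return sorted(annotations, key=lambda a: a["start"])


if __name__ == "__main__":
    for annotation in lookup("22 Pits were cut through the upper layer"):
        print(annotation)
```

Because every annotation keeps the concept identifier, a matched word can later be resolved back to its thesaurus entry rather than being treated as a plain string.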

JAPE Pattern Matching Rules
Example: "Ditch containing prehistoric pottery dating to the Late Bronze Age or Early Iron Age along with burnt flints and flint flakes"
- Annotation types: E53 Place, E49 Time Appellation, E19 Physical Object
- Natural language - gazetteer look-up
- Pattern matching rules expanded beyond simple gazetteer look-up:
  - E49 E49: "Late Bronze Age or Early Iron Age"
  - E49 E19: "prehistoric pottery"
  - E53 ( / ): "Ditch containing prehistoric pottery"

A Cascading Extraction Process
- A cascading order of natural language processes over text
- Expanding from simple gazetteer look-up matching rules to complex JAPE transducers
- Build up from previously defined annotations to express annotation structures (templates) of ontological concepts
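The cascade can be illustrated in plain Python on the example sentence from the JAPE slide. This is a sketch of the general technique, not JAPE and not the project's rules: the three-stage split, the toy vocabulary and the connecting words " or " and " containing " are assumptions. Each stage consumes only the annotations produced by the stage before it.

```python
import re

TEXT = ("Ditch containing prehistoric pottery dating to the Late Bronze Age "
        "or Early Iron Age along with burnt flints and flint flakes")

# Stage 1 vocabulary: a toy gazetteer, not the EH thesauri.
LOOKUP = {
    "Place": ["ditch"],
    "PhysicalObject": ["pottery"],
    "TimeAppellation": ["late bronze age", "early iron age", "prehistoric"],
}


def stage1_lookup(text):
    """Simple gazetteer look-up: annotations are (type, start, end) tuples."""
    found = []
    for ann_type, terms in LOOKUP.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, text, re.IGNORECASE):
                found.append((ann_type, m.start(), m.end()))
    return sorted(found, key=lambda a: a[1])


def stage2_phrases(text, anns):
    """Combine neighbouring look-ups into longer phrase annotations."""
    out = list(anns)
    for (t1, s1, e1), (t2, s2, e2) in zip(anns, anns[1:]):
        between = text[e1:s2]
        if t1 == t2 == "TimeAppellation" and between == " or ":
            out.append(("TimeAppellation", s1, e2))   # period "or" period
        elif t1 == "TimeAppellation" and t2 == "PhysicalObject" and between == " ":
            out.append(("PhysicalObject", s1, e2))    # dated object
    return sorted(out, key=lambda a: (a[1], -a[2]))


def stage3_relations(text, anns):
    """Template rule: a Place 'containing' a PhysicalObject phrase."""
    relations = []
    for t1, s1, e1 in anns:
        if t1 != "Place":
            continue
        for t2, s2, e2 in anns:
            if t2 == "PhysicalObject" and text[e1:s2] == " containing ":
                relations.append(text[s1:e2])
    return relations


if __name__ == "__main__":
    annotations = stage2_phrases(TEXT, stage1_lookup(TEXT))
    print([TEXT[s:e] for _, s, e in annotations])
    print(stage3_relations(TEXT, annotations))
```

The third stage never looks at raw words other than the connector; it relies on the phrase annotations built in stage two, which is the sense in which later rules "build up from previously defined annotations".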

Annotation Types exposed in XML
- Annotation types: "Ditch containing prehistoric pottery"
- Semantic attributes for annotation types:
  <PhysicalObject gateId="8749" SKOS-EH="134718" thesaurus="EH-Object Types" class="EHE0009.ContextFind" ontology=" /CIDOC_v4.2_extensions_eh_.rdf">
- XML annotation structures: DOM - XML applications
- Andronikos*: uses PHP-MySQL to display semantic index values in HTML format
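As an illustration of how an annotation with semantic attributes might be written inline, the Python sketch below wraps a phrase in a `PhysicalObject` element using the standard library. The attribute values are the ones shown on the slide; the `Sentence` wrapper element and the helper function are assumptions, and the truncated `ontology` attribute is left out rather than guessed.

```python
import xml.etree.ElementTree as ET


def annotate(text, phrase, tag, attributes, wrapper="Sentence"):
    """Wrap the first occurrence of `phrase` in `text` with an annotation element.

    `wrapper` is a hypothetical container element name.
    """
    start = text.index(phrase)
    root = ET.Element(wrapper)
    root.text = text[:start]                    # text before the annotation
    element = ET.SubElement(root, tag, attributes)
    element.text = phrase                       # the annotated span
    element.tail = text[start + len(phrase):]   # text after the annotation
    return root


if __name__ == "__main__":
    root = annotate(
        "Ditch containing prehistoric pottery",
        "prehistoric pottery",
        "PhysicalObject",
        {
            "gateId": "8749",
            "SKOS-EH": "134718",
            "thesaurus": "EH-Object Types",
            "class": "EHE0009.ContextFind",
        },
    )
    print(ET.tostring(root, encoding="unicode"))
```

Keeping the surrounding text in the same element means the original sentence can always be recovered by reading the text content and ignoring the markup.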

'Gold Standard' Pilot Evaluation
[Table: Precision, Recall and F-measure, with columns AV, CB, DT, KM, TOTAL and TOTAL-ALL; the cell values are not preserved in this transcript.]
- 'Gold standard': a collective effort of human annotators
- Manual annotation of the gold standard with respect to the Annotation Types (aimed at suggesting expansion)
- Pilot study (formative assessment).
- Aimed at benchmarking the performance of the extraction mechanism
- Inter-annotator scores
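For reference, the three measures named in the table can be computed as in the Python sketch below. It assumes strict matching, where a system annotation counts as correct only if its type and span are identical to a gold standard annotation; the slides do not say which matching policy the pilot used, and the sample annotations are invented.

```python
def evaluate(system, gold):
    """Return (precision, recall, f_measure) for two collections of annotations.

    Annotations are hashable items such as (type, start, end) tuples.
    """
    system, gold = set(system), set(gold)
    correct = len(system & gold)
    precision = correct / len(system) if system else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Invented annotations, for illustration only.
    gold = [("Place", 0, 5), ("PhysicalObject", 17, 36), ("TimeAppellation", 51, 84)]
    system = [("Place", 0, 5), ("PhysicalObject", 29, 36), ("TimeAppellation", 51, 84)]
    print(evaluate(system, gold))
```

In the invented example the system finds "pottery" but not the full gold phrase, so that annotation counts as both a false positive and a miss under strict matching.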

Pilot Evaluation Results - Discussion
- Encouraging Recall and Precision rates, over 70% for Time Appellation concepts
- The limited number of glossary terms (Places) has influenced the performance
- Agreement for Place and Physical Objects was not always clear-cut (e.g. 'burnt tree throws')
- The potential of the method to extract complex phrases associated with two or more ontological entities
- Future work
  - Incorporation of additional ontological entities (Material, Samples)
  - Gazetteer enhancement
  - Pattern matching rules expansion
  - Formal evaluation of the extraction method and overall retrieval performance

Model Adaptation Issues
[Figure: CRM-EH event chain, in RDF, for the phrase "Ditch containing prehistoric pottery". Entities: Context (E53.Place, EH_E0007); ContextFind (E19.Physical Object, EH_E0009); ContextFind DepositionEvent (E9.Move, EH_E1004); ContextFind ProductionEvent (E12.Production Event, EH_E1002); ContextFind ProductionEvent Timespan (E52.Timespan, EH_E0038); ContextFind ProductionEvent TimespanAppellation (E49.Time Appellation, EH_E0039). Property labels: P26.Moved to; P25.Moved; P108.Produced by; P108.has timespan; P1.Identified By.]
- CRM-EH is a detailed, event-driven model.
- Natural language can be abstract.
- Mapping with entities/properties can bypass model verbosity.
- Interoperable indices formats

Prototype Query Builder
- Inter-relationships of the CRM-EH modelled data.
- Short-cuts for traversing the commonly followed relationships between key entities
- Key relationships associated with an archaeological Context:
  - Find
  - Sample
  - Stratigraphic, Spatial, Temporal
  - Group

Prototype Indices Deployment
- Andronikos web portal development
- Utilises the semantic annotation XML files
- Server-side technology: PHP DOM XML
- MySQL database server to store relevant thesauri structures.
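The portal itself is described as PHP with DOM XML; purely to illustrate the indexing step, the Python sketch below reads an annotated XML document and groups the annotated phrases by concept identifier. The `Document` wrapper, the second annotation and its identifier `900001` are assumptions; only the `PhysicalObject` attribute values appear in the slides.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Sample annotated text; the TimeAppellation element and its ids are hypothetical.
SAMPLE = (
    '<Document>Ditch containing '
    '<PhysicalObject gateId="8749" SKOS-EH="134718">prehistoric pottery</PhysicalObject>'
    ' dating to the '
    '<TimeAppellation gateId="8750" SKOS-EH="900001">Late Bronze Age</TimeAppellation>'
    '</Document>'
)


def build_index(xml_text, id_attribute="SKOS-EH"):
    """Map each concept identifier to the (annotation type, phrase) pairs using it."""
    index = defaultdict(list)
    root = ET.fromstring(xml_text)
    for element in root.iter():
        concept_id = element.get(id_attribute)
        if concept_id is not None:
            phrase = "".join(element.itertext())
            index[concept_id].append((element.tag, phrase))
    return dict(index)


if __name__ == "__main__":
    for concept_id, entries in build_index(SAMPLE).items():
        print(concept_id, entries)
```

An index keyed on concept identifiers is what lets a thesaurus stored separately, for example in a database, be joined to the documents in which each concept was annotated.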

STAR: Semantic Technologies for Archaeological Resources