TEI, CIDOC-CRM and a Possible Interface between the Two? Øyvind Eide & Christian-Emil Ore Unit for Digital Documentation, University of Oslo, Norway.


1 TEI, CIDOC-CRM and a Possible Interface between the Two? Øyvind Eide & Christian-Emil Ore Unit for Digital Documentation, University of Oslo, Norway

2 The CIDOC Conceptual Reference Model (cidoc.ics.forth.gr)
What is the CIDOC CRM?
–An object oriented ontology developed by ICOM-CIDOC, 1996-2005
–Accepted as ISO-21127 in June 2005
–About 80 classes and 130 properties for cultural and natural history
–CRM instances can be encoded in many forms: RDBMS, ooDBMS, XML, RDF(S), OWL
What is the CIDOC CRM for?
–Intellectual guide to create schemata, formats, profiles
–Best practice guide
–A language for analysis of existing sources and models for data integration (mapping)
–Transportation format for data integration / migration / Internet
Ongoing activities
–CRM-Core
–Harmonisation with the object oriented version of FRBR (Functional Requirements for Bibliographic Records, IFLA); first version will be published in fall 2006
–Extension of CRM with a categorical level, e.g. reoccurring events

3 The CIDOC CRM Top-level Classes relevant for Integration
[Diagram] Classes: E39 Actors (persons, inst.), E55 Types, E28 Conceptual Objects, E18 Physical Things, E2 Temporal Entities (Events), E41 Appellations, E53 Places, E52 Time-Spans
Relation labels: participate in; refer to / refine; refer to / identify; have location; within; at; affect or refer to

4 CIDOC CRM: Class hierarchy

5 CIDOC CRM: Events

6 CIDOC CRM: Things and Conceptual Objects

7 Motivation: Grey literature in Museums
Original text (text witness)
Step 1 (registration): Bibliographical record
Step 2 (reproduction): Facsimile
Step 3 (transcription): Text with XML mark-up. 1. Structural mark-up (2. Lemmatization etc.)
Step 4 (content mark-up): Text with XML mark-up (information elements identified and marked up according to a simple information model, DTD)
Museum database: artefacts, excavations, referential information
Event/object oriented model (CIDOC-CRM compatible)

8 Catalogue entry 8. Malayan dagger, taken from pirates of the Indian Oceans. Beautiful handle, graven as a human figure above waistline. Snake winded blade. VII, IX, p, 2. Daa,O., 99. Donated April 11 1856 from Captain Teiste. Motivation: Grey literature in Museums

9 Catalogue entry with mark up 8. Malayan dagger, taken from pirates of the Indian Oceans. Beautiful handle, graven as a human figure above waistline. Snake winded blade. VII, IX, p, 2. Daa,O., 99. Donated April 11 1856 from Captain Teiste. Motivation: Grey literature in Museums

10 The excavation in Wasteland in 2005 was performed by Dr. Diggey. He had the misfortune of breaking the beautiful sword (C50435) into 30 pieces. Motivation: Grey literature in Museums

11 The content of the text expressed in CIDOC-CRM
E31 Document → P2 has type → E55 Type “Archaeological report”
E31 Document → P70 documents → E7 Activity
E7 Activity → P2 has type → E55 Type “Archaeological excavation”
E7 Activity → P14 carried out by → E21 Person (actor)
E21 Person (actor) → P1 is identified by → E82 Actor appellation “Dr. Diggey”
E7 Activity → P4 has time-span → E52 Time span
E52 Time span → P78 is identified by → E50 Date “2005”
E7 Activity → P7 took place at → E53 Place
E53 Place → P87 is identified by → E44 Place appellation “Wasteland”
E11 Modification “Breaking of the sword” → P9 forms part of → E7 Activity
E22 Man-Made object “Sword” → P12 was present at → E11 Modification
E22 Man-Made object “Sword” → P1 is identified by → E42 Object identifier “C50435”
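As a rough illustration of how the graph on this slide could be held in memory, the sketch below stores it as subject/property/object statements and answers two simple questions about it. The node ids (`report`, `excavation`, `diggey`, ...) are hypothetical names introduced only for this example, and the edges are read off the slide together with the sample sentence, so treat it as one plausible reading rather than the authors' data model. Names, types and identifiers are kept as plain string literals.

```python
# Hypothetical node ids mapped to the CRM classes named on the slide.
NODE_CLASS = {
    "report": "E31 Document",
    "excavation": "E7 Activity",
    "diggey": "E21 Person",
    "breaking": "E11 Modification",
    "sword": "E22 Man-Made Object",
    "period": "E52 Time-Span",
    "site": "E53 Place",
}

# (subject, property, object) statements; objects that are not node ids
# are plain literals (appellations, type names, identifiers).
TRIPLES = [
    ("report", "P2 has type", "Archaeological report"),
    ("report", "P70 documents", "excavation"),
    ("excavation", "P2 has type", "Archaeological excavation"),
    ("excavation", "P14 carried out by", "diggey"),
    ("diggey", "P1 is identified by", "Dr. Diggey"),
    ("excavation", "P4 has time-span", "period"),
    ("period", "P78 is identified by", "2005"),
    ("excavation", "P7 took place at", "site"),
    ("site", "P87 is identified by", "Wasteland"),
    ("breaking", "P9 forms part of", "excavation"),
    ("sword", "P12 was present at", "breaking"),
    ("sword", "P1 is identified by", "C50435"),
]


def objects(subject, prop, triples=TRIPLES):
    """Return every object linked from `subject` by property `prop`."""
    return [o for s, p, o in triples if s == subject and p == prop]


def subjects(prop, obj, triples=TRIPLES):
    """Return every subject linked to `obj` by property `prop`."""
    return [s for s, p, o in triples if p == prop and o == obj]


# Who carried out the excavation, and which sub-events belong to it?
excavators = objects("excavation", "P14 carried out by")
sub_events = subjects("P9 forms part of", "excavation")
```

Following the links from `excavation` to `diggey` and on to the literal "Dr. Diggey" mirrors the path E7 → E21 → appellation in the diagram.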

12 TEI - where did it come from? (Acc. to L. Burnard)
Originally, a research project within the humanities
–Founded in 1987-88
–Sponsored by three professional associations
–Funded 1990-1994 by US NEH, EU LE Programme et al.
Major influences
–digital libraries and text collections
–language corpora
–scholarly datasets
International consortium established June 1999 (see http://www.tei-c.org/)

13 Goals of the TEI (Acc. to L. Burnard)
better interchange and integration of scholarly data
support for all texts, in all languages, from all periods
guidance for the perplexed: what to encode; hence, a user-driven codification of existing best practice
assistance for the specialist: how to encode; hence, a loose framework into which unpredictable extensions can be fitted
These apparently incompatible goals result in a highly flexible, modular environment

14 TEI Deliverables (Acc. to L. Burnard)
A set of recommendations for text encoding, covering both generic text structures and some highly specific areas, based on (but not limited by) existing practice
A very large collection of element definitions (400+) with associated declarations for various schema languages
A modular system for creating personalized schemas or DTDs from the foregoing
For the full picture see http://www.tei-c.org/TEI/Guidelines/

15 Legacy of the TEI (Acc. to L. Burnard)
a way of looking at what ‘text’ really is
a codification of current scholarly practice
(crucially) a set of shared assumptions about the digital agenda:
–focus on content and function (rather than presentation)
–identify generic solutions (rather than application-specific ones)

16 TEI - the header
Elements for detailed bibliographic description:
–File description: Title statement; Edition statement; Extent statement; Publication statement; Series statement; Notes; Source Description (bibliographic elements, Manuscript description)
–Encoding description
–Profile description
–Revision description
Mapping to other metadata standards
–MARC: discussed
–Dublin Core: unfinished

17 TEI additional element sets
Base Tag Set for Verse
Performance Texts
Transcription of Speech
Print Dictionaries
Manuscript description
Linking and alignment; analysis
Feature structures; Certainty; physical transcription; textual criticism
Names and dates
Graphs, networks and trees
Graphics, figures and tables
Language Corpora
Representation of non-standard characters and glyphs
Feature System Declaration

18 Some “ontological” elements in TEI: Events
History –groups elements describing the full history of a manuscript or manuscript part.
Origin –contains any descriptive or other information concerning the origin of a manuscript or manuscript part.
CustEvent –describes a single event during the custodial history of a manuscript.
Provenance –contains any descriptive or other information concerning a single identifiable episode during the history of a manuscript or manuscript part, after its creation but before its acquisition.
Acquisition –contains any descriptive or other information concerning the process by which a manuscript or manuscript part entered the holding institution.

19 Some “ontological” elements in TEI: Events, time appellations
Event –any phenomenon or occurrence, not necessarily vocalized or communicative, for example incidental noises or other events affecting communication, e.g. “ceiling collapses” during a recorded interview.
persEvent –contains a description of a particular event of significance in the life of a person.
Birth, death –contains information about a person's birth/death, such as its date and place.
Date –contains a date in any format.
Occasion –a temporal expression (either a date or a time) given in terms of a named occasion such as a holiday, a named time of day, or some notable event.

20 Some “ontological” elements in TEI: Actors and appellations
Person –provides information about an identifiable individual, for example a participant in a language interaction, or a person referred to in a historical source.
Hand –used in the header to define each distinct scribe or handwriting style.
Author –in a bibliographic reference, contains the name of the author(s), personal or corporate, of a work; the primary statement of responsibility for any bibliographic item.
Name –(name, proper noun) contains a proper noun or noun phrase.

21 Ovid Publius Ovidius Naso 20 March 43 BC Sulmona Italy 17 or 18 AD Tomis (Constanta) Romania Some “ontological” elements in TEI: Person example (from P5 guidelines)
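The markup of this person example did not survive in the transcript; only its text content did. The sketch below wraps that content in elements along the lines of the TEI P5 person example (`person`, `persName`, `birth`, `death`, `placeName`, `settlement`, `country`) and reads it back with Python's standard library. The exact element and attribute choices on the original slide are an assumption here, and attributes are left out entirely.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the slide's example; only the text content
# (names, dates, places) is taken from the slide itself.
PERSON_XML = """
<person>
  <persName>Ovid</persName>
  <persName>Publius Ovidius Naso</persName>
  <birth>20 March 43 BC
    <placeName><settlement>Sulmona</settlement> <country>Italy</country></placeName>
  </birth>
  <death>17 or 18 AD
    <placeName><settlement>Tomis (Constanta)</settlement> <country>Romania</country></placeName>
  </death>
</person>
"""


def life_event(person, tag):
    """Collect the date text and place parts of a <birth> or <death> element."""
    event = person.find(tag)
    return {
        # The date is the text that precedes the first child element.
        "date": (event.text or "").strip(),
        "settlement": event.findtext("placeName/settlement"),
        "country": event.findtext("placeName/country"),
    }


person = ET.fromstring(PERSON_XML)
names = [name.text for name in person.findall("persName")]
birth = life_event(person, "birth")
death = life_event(person, "death")
```

Read this way, `birth` and `death` already look like small event records with a date and a place, which is what makes these elements candidates for a mapping to CRM events.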

22 A simple extension of the TEI-dtd
The root CIDOC-CRM element
The class element
<!ATTLIST crmClass id ID className CDATA>
The property element
<!ELEMENT crmProperty EMPTY>
<!ATTLIST crmProperty id ID propName CDATA from IDREF to IDREF>
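To make the shape of this extension concrete, here is a minimal sketch that builds `crmClass` and `crmProperty` elements with the attributes declared on the slide and checks that every `from`/`to` reference points at an existing class `id`, which is the guarantee the ID/IDREF types are meant to give. The root element name `crm`, the underscore-style `className` and `propName` values, and the idea that `crmClass` wraps text are assumptions for illustration; the slide's declaration of the root element is not preserved.

```python
import xml.etree.ElementTree as ET


def crm_class(parent, id_, class_name, text=""):
    """Append a crmClass element carrying the id and className attributes."""
    element = ET.SubElement(parent, "crmClass", {"id": id_, "className": class_name})
    element.text = text
    return element


def crm_property(parent, id_, prop_name, from_id, to_id):
    """Append an empty crmProperty element linking two crmClass ids."""
    attributes = {"id": id_, "propName": prop_name, "from": from_id, "to": to_id}
    return ET.SubElement(parent, "crmProperty", attributes)


def dangling_refs(root):
    """List (property id, attribute) pairs whose IDREF matches no crmClass id."""
    known = {c.get("id") for c in root.iter("crmClass")}
    return [
        (p.get("id"), attr)
        for p in root.iter("crmProperty")
        for attr in ("from", "to")
        if p.get(attr) not in known
    ]


root = ET.Element("crm")  # hypothetical root element name
crm_class(root, "c1", "E7_Activity", "The excavation")
crm_class(root, "c2", "E21_Person", "Dr. Diggey")
crm_property(root, "p1", "P14_carried_out_by", "c1", "c2")

serialized = ET.tostring(root, encoding="unicode")
```

A validating XML parser would perform the same reference check from the DTD itself; the helper only stands in for it here because the standard library parser does not validate.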

23 The excavation in Wasteland in 2005 was performed by Dr. Diggey. He had the misfortune of breaking the beautiful sword (C50435) into 30 pieces. The sample text revisited

24 The text expressed with a TEI mark-up The excavation in Wasteland in 2005 was performed by Dr. Diggey. He had the misfortune of breaking the beautiful sword (C50435) into 30 pieces.

25 archaeological excavation Dr. Diggey 2005 … … … Encoding the information in an RDF-triplet fashion
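The tags of this slide were lost in extraction, so the following is only a sketch of what "RDF-triplet fashion" could mean with the `crmClass` and `crmProperty` elements: each property element acts as the predicate between two class elements, and the statements can be read back as triples. The root element `crm`, the id values, and the `className`/`propName` strings are hypothetical; the three text values come from the slide.

```python
import xml.etree.ElementTree as ET

# Assumed encoding of part of the sample text; not the slide's exact markup.
SAMPLE = """
<crm>
  <crmClass id="e1" className="E7_Activity">archaeological excavation</crmClass>
  <crmClass id="a1" className="E21_Person">Dr. Diggey</crmClass>
  <crmClass id="t1" className="E52_Time-Span">2005</crmClass>
  <crmProperty id="p1" propName="P14_carried_out_by" from="e1" to="a1"/>
  <crmProperty id="p2" propName="P4_has_time-span" from="e1" to="t1"/>
</crm>
"""


def to_triples(xml_text):
    """Resolve each crmProperty into a (subject text, property, object text) triple."""
    root = ET.fromstring(xml_text)
    label = {c.get("id"): (c.text or "").strip() for c in root.iter("crmClass")}
    return [
        (label[p.get("from")], p.get("propName"), label[p.get("to")])
        for p in root.iter("crmProperty")
    ]


triples = to_triples(SAMPLE)
```

Because the class elements can sit inline around the words they describe while the property elements sit anywhere, the text itself stays readable and the assertions about it remain separable.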

26 CRM-Core – a dtd for encoding information [suggested by CRM-SIG]

27 Encoding the information in CRM Core (Factoids)
E31 Document; Archaeological report; Wasteland excavation 2005 report
P70_documents Wasteland_2005_excavation; E7_Activity; Dr. Diggey, excavator; C50435, sword; 2005; Wasteland
P70_documents damage_to_artifact_C50435; E11_Modification; Dr. Diggey, excavator; C50435, sword; P9_forms_part_of Wasteland_2005_excavation

28 E21 Person archaeologist Dr. Diggey P14 carried out by damage_to_artifact_C50435 E11 Modification excavator C50435 sword E82 Actor appellation formal name mention of name Wasteland_excavation_2005_report#n2 Encoding the information in CRM Core (Factoids)

29 Conclusions and further work
Possible now
–TEI extended with an RDF-like CIDOC-CRM
–TEI extended with CRM-Core records
Future:
–Make a mapping from TEI elements to CRM
–Make a mapping from the TEI header into ooFRBR
–Create an extension of the TEI definition
–Write guidelines for CIDOC-CRM encoding of information in TEI documents
–Convince the TEI users

