Draft Ideas on a Process to Design and Build the DFT Vocabulary. Gary Berg-Cross. Developed for the DFT WG Session at the 2nd RDA Plenary, Sept. 2013, Washington.

Presentation transcript:

Draft Ideas on a Process to Design and Build the DFT Vocabulary
Gary Berg-Cross
Developed for the DFT WG Session at the 2nd RDA Plenary, Sept. 2013, Washington DC

[Title slide diagram labels: Terms, Synonyms, Conceptual Model, Concepts]

Topical Outline
- Overview of Phased Plan
- Basic Ideas: Vocabularies and Concepts
- Start up, Requirements analysis and development of candidate list
- Vocabulary Analysis Process
- Vocabulary Design Process
- Refinement
- Draft Vocabulary Publication and Review

Basic Ideas

Term
Definition: Verbal designation (3.4.1) of a general concept in a specific subject field. (ISO :2000)
NOTE: A term may contain symbols and can have variants, e.g. different forms of spelling.

- A Controlled Vocabulary (CV) is a consensus, standardized set of terms, including new ones, used to refer to concepts.
- Terms are proxies for concepts within a conceptual system.
- Standards (like ISO) emphasize principles of vocabulary control that guide their design and development.

1. Eliminate (conceptual) ambiguity
   1. Use principles for defining/describing the concepts to which terms are assigned
   2. Show relations and structure to help understanding
2. Control synonyms (term equivalence)
   - Simple formulation as a synonym ring
   - "mg/l" has the synonym "milligrams per liter"; both refer to one concept
   - "Databank" is an obsolete synonym for "database" (the preferred term)
3. Establish relations among terms where appropriate
4. Consider and systematize the role of lexical modifiers (e.g. digital or data object)
5. Employ Guiding Principles…

[Diagram labels: Data Object, Digital Object, Data Element, Data Set, digital record]
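The synonym-control principle on this slide can be illustrated with a small sketch. It is not part of the original presentation: the class name `ControlledVocabulary`, its methods, and the concept identifiers `C1` and `C2` are hypothetical, chosen only for illustration. The sketch assumes that each term designates at most one concept (principle 1) and that a ring may optionally name a preferred term (principle 2).

```python
from typing import Dict, Iterable, Optional


class ControlledVocabulary:
    """Toy controlled vocabulary: every term designates exactly one concept."""

    def __init__(self) -> None:
        self._concept_of: Dict[str, str] = {}  # normalized term -> concept id
        self._preferred: Dict[str, str] = {}   # concept id -> preferred term

    @staticmethod
    def _norm(term: str) -> str:
        # Treat letter case and surrounding whitespace as spelling variants.
        return term.strip().casefold()

    def add_ring(self, concept_id: str, terms: Iterable[str],
                 preferred: Optional[str] = None) -> None:
        """Register a ring of equivalent terms for one concept."""
        ring = list(terms)
        if preferred is not None and preferred not in ring:
            ring.append(preferred)
        # Check everything first so a rejected ring leaves the vocabulary unchanged.
        for term in ring:
            owner = self._concept_of.get(self._norm(term))
            if owner is not None and owner != concept_id:
                raise ValueError(f"{term!r} already designates concept {owner}")
        for term in ring:
            self._concept_of[self._norm(term)] = concept_id
        if preferred is not None:
            self._preferred[concept_id] = preferred

    def same_concept(self, a: str, b: str) -> bool:
        """True when both terms are known and refer to the same concept."""
        ca = self._concept_of.get(self._norm(a))
        cb = self._concept_of.get(self._norm(b))
        return ca is not None and ca == cb

    def preferred_term(self, term: str) -> Optional[str]:
        """Preferred term for the concept behind `term`, if one was declared."""
        concept = self._concept_of.get(self._norm(term))
        return self._preferred.get(concept) if concept is not None else None


# The two examples from the slide; concept ids are made up.
cv = ControlledVocabulary()
cv.add_ring("C1", ["mg/l", "milligrams per liter"])    # plain synonym ring
cv.add_ring("C2", ["databank"], preferred="database")  # obsolete -> preferred
```

With this in place, `cv.same_concept("mg/l", "milligrams per liter")` holds, and `cv.preferred_term("databank")` points to "database". Trying to attach "database" to a third concept raises an error, which is one simple way to keep a term from becoming ambiguous.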

Vocabulary, Concepts and Reality (after Ogden and Richards)

[Diagram labels: Name, Reality, Correspondence, Understand]

- Reality: semantics starts with evidence.
- What is the concept? Something we understand.
- Extant work: Philosophy, Psych, Data Science perspectives…
- "This data collection is different from that data set." Maybe there is more than one type of aggregation?
- Task: regiment language (assertions using controlled terms?)

Dataset
- A dataset is a logically meaningful grouping of similar or related data.
- Dataset isa grouping.
- Data in a dataset has relation. Relation types: source or class of source, processing level, algorithms, topic, time period…
- A dataset isPartOf a dataset series…
- A dataset series (according to ISO 19115 & 19114) is a collection of datasets sharing the same product specification…
- Objects > data objects

Vocabulary: sets of terms used by groups of cognitive agents to represent and communicate about concepts. In a language, syntax can relate symbolic designations/terms.
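One way to read "regiment language" is to restrict assertions to subject-relation-object triples built only from controlled terms and controlled relations. The sketch below is an illustrative assumption, not something the slides prescribe: the triple encoding, the names `CONTROLLED_TERMS`, `CONTROLLED_RELATIONS`, `violations` and `objects_of`, and the choice of which terms count as controlled are all hypothetical. The three assertions themselves are taken from the Dataset statements above.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject term, relation, object term)

# Hypothetical controlled sets, limited to what the slide mentions.
CONTROLLED_TERMS = {"dataset", "dataset series", "grouping", "collection"}
CONTROLLED_RELATIONS = {"isa", "isPartOf"}

# The slide's statements about datasets, written as regimented assertions.
ASSERTIONS: List[Triple] = [
    ("dataset", "isa", "grouping"),
    ("dataset", "isPartOf", "dataset series"),
    ("dataset series", "isa", "collection"),
]


def violations(assertions: Iterable[Triple]) -> List[str]:
    """List every use of a relation or term that is not controlled."""
    problems: List[str] = []
    for subject, relation, obj in assertions:
        if relation not in CONTROLLED_RELATIONS:
            problems.append(f"uncontrolled relation: {relation}")
        for term in (subject, obj):
            if term not in CONTROLLED_TERMS:
                problems.append(f"uncontrolled term: {term}")
    return problems


def objects_of(subject: str, relation: str,
               assertions: Iterable[Triple]) -> Set[str]:
    """Everything the subject is linked to through the given relation."""
    return {o for s, r, o in assertions if s == subject and r == relation}


print(violations(ASSERTIONS))                        # no problems expected
print(objects_of("dataset", "isPartOf", ASSERTIONS))
```

A statement such as ("data collection", "resembles", "dataset") would be flagged twice, once for the term and once for the relation, which is the kind of signal that prompts the question of whether a new concept or relation is needed.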

Start up Activities

Scope
- Needs to be practical and focused, but useful to the overall RDA effort
- Implied by the scope of current documents and cited work
- Input from RDA WGs

Identify core concepts and terms
- Growing list in current documents
- Gather definitions from sources and interested parties (3-4 examples discussed later at this session)
- Implied concepts that may be needed to bridge differences

Gather and discuss ideas/needs/interests from RDA WGs

Understand a concept-vocabulary development process

Employ Guiding Principles
1. Reuse: don't re-invent. Adapt existing standards, methods and vocabulary that are fit for purpose.
2. Engage: test and validate terms in the community, with relevant WGs and Communities of Interest/Practice (CoI/CoP), for analysis, design, production.

Many international standards and other specifications
- ISO 704:2000 – Terminology – Principles and methods
- ISO :2000 – Terminology – Part 1: General vocabulary
- ISO/IEC :2003 – Metadata registries – Part 3: Metamodel & basic attributes
- ISO/IEC :2005 – Metadata registries – Part 6: Registration
- ISO/IEC 11404:2007 – General purpose datatypes
- ISO/IEC 19773:2011 – Metadata Modules
- Open Government Vocabularies – Content Model (Open Government Vocabularies Working Group): %2Fgld%2Fwiki%2Fimages%2F9%2F96%2FOpen_Government_Vocabualries_-_Content_Model_v02.doc

Data Element: A logical, identifiable unit of data that forms the basic organizational component in a database. Usually a combination of characters or bytes referring to one separate piece of information. A data element may combine with one or more other data elements or digital objects to form a digital record.

A Start on Vocabulary Analysis Process
- Identify concepts and concept relations implied by collected terms;
- Analyze and model concept systems on the basis of identified concepts and concept relations that are used to understand a term and its referent;
- Establish representations of concept systems through concept diagrams;
- Craft concept-oriented definitions as a concept base;
  - Test arrangement in taxonomical class hierarchies
  - Add essential properties/attributes/slots to distinguish related concepts
  - Link concepts via relations… etc.
- Associate a designated vocabulary term with each concept (in one or more languages); and,
- Document the vocabulary in an agreed-upon form, perhaps starting as a structured glossary with supporting concept models.

Adapted liberally from ISO TC 37 Standards, Basic Principles of Terminology

Vocabulary Design Process & Vocabulary Qualities

Both analysis and design may employ conceptual modeling to capture the essential meaning and structure of the descriptions of the vocabulary. The product of this is some form of conceptual model.

Desired Qualities
1. Adequate capture of content intuitions, expressed by domain experts
   1. In an understandable form
   2. Includes details on constraining descriptions
   3. Uses well-defined relations, taxonomic and others
   4. Illustrated with examples
2. Rigorous: stands up to rational analysis
3. Minimally redundant: no unintended synonyms

Vocabulary & Model Artifact Designed by Rigorous Method

[Diagram, adapted liberally from Guarino's 1998 Formal Ontology in Information Systems. Labels: World Situations (data on a wetlands), Data World Situations, Conceptualization, Intended Model, Fitting Conceptualization, Our Vocabulary & Model Product, Observation, Interaction]

- Conceptualization starts to model (part of) the data world, expressed in a communicative form.
- The vocabulary commits to certain relations, say subtypes (Subclass…).
- The vocabulary and model product is a specific artifact designed to express the intended meaning of a (shared) vocabulary.
- Observation: act of observing a phenomenon, with the goal of producing an estimated property value (OGC O&M model).

Simple Vocab Entry Example

Data Object
- Type of: Abstract Object
- Sub-types: Data object, digital object, …
- Definition: In computer science, an object is any entity that can be manipulated by the commands of a programming language, such as a value, variable, function, or data structure. (With the later introduction of object-oriented programming the same word, "object", refers to a particular instance of a class.)
- Definition 2: A Data Object is a dataset.
- Equivalent terms (other languages)…
- Attributes…
- Relations: a data element isPartOf Data Object…
- Examples/Instances include: repository metadata, data models, databases, tables, views, files, entities, columns, data elements, and attributes. (Source
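As a rough illustration of how an entry like this might be held in a structured glossary, here is a minimal sketch. The `VocabularyEntry` class, its field names, and the plain-text rendering are assumptions made for this example; the slides do not prescribe a storage format. The values are a shortened subset of the Data Object entry above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VocabularyEntry:
    """One glossary entry; field names mirror the headings on the slide."""
    term: str
    type_of: str
    sub_types: List[str] = field(default_factory=list)
    definitions: List[str] = field(default_factory=list)
    relations: List[Tuple[str, str, str]] = field(default_factory=list)
    examples: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the entry as indented plain text, skipping empty fields."""
        lines = [self.term, f"  Type of: {self.type_of}"]
        if self.sub_types:
            lines.append("  Sub-types: " + ", ".join(self.sub_types))
        for number, definition in enumerate(self.definitions, start=1):
            lines.append(f"  Definition {number}: {definition}")
        for subject, relation, obj in self.relations:
            lines.append(f"  Relation: {subject} {relation} {obj}")
        if self.examples:
            lines.append("  Examples: " + ", ".join(self.examples))
        return "\n".join(lines)


# A shortened version of the slide's entry.
entry = VocabularyEntry(
    term="Data Object",
    type_of="Abstract Object",
    sub_types=["digital object"],
    definitions=["A Data Object is a dataset."],
    relations=[("data element", "isPartOf", "Data Object")],
    examples=["databases", "tables", "files"],
)
print(entry.render())
```

Keeping relations as explicit (subject, relation, object) tuples, rather than free text, makes it easier to move later from a structured glossary toward a concept model.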

Refinement

[Diagram: from Reality and Data, through analysis to understand data concepts, moving from informal models to formal models: Natural Language…, Controlled Vocabulary, Scientific "Models" (see above), Semantic Web formalisms (Taxonomies, RDF(S), OWL…). Label: Simplify more. From "Long-term Preservation for Spatial Data Infrastructures: a Metadata Framework and Geo-portal Implementation"]

Community validation is necessary for standardization, but also gradual formalization.

Vocabulary can be Built out in Stages, for example:

[Diagram: nested scopes (Core Starter Set, WG Scope, RDA Scope…) against a timeline marked at 3, 6, 9 and 12 months]

Draft Vocabulary Publication and Review

Products:
- A reference document about DFT, including a structured and well-documented vocabulary that distinguishes Preferred, Admitted, Deprecated and Obsolete terms
- Register the defined terms in an ISO-like concept registry so that everyone can easily refer to them
- Create an accompanying abstract data organization/conceptual model that may also be expressed graphically
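The distinction between Preferred, Admitted, Deprecated and Obsolete terms could be recorded in a registry along the following lines. This is a minimal in-memory sketch under stated assumptions: the names `Status`, `Registration` and `ConceptRegistry`, the concept id `C2`, and the rule of one preferred term per concept are hypothetical and do not reflect the interface of any actual ISO registry. The databank/database example comes from the Basic Ideas slide.

```python
from enum import Enum
from typing import Dict, NamedTuple, Optional


class Status(Enum):
    PREFERRED = "preferred"
    ADMITTED = "admitted"
    DEPRECATED = "deprecated"
    OBSOLETE = "obsolete"


class Registration(NamedTuple):
    concept_id: str
    status: Status


class ConceptRegistry:
    """Toy registry mapping terms to a concept id and an acceptability status."""

    def __init__(self) -> None:
        self._terms: Dict[str, Registration] = {}

    def register(self, term: str, concept_id: str, status: Status) -> None:
        key = term.casefold()
        if status is Status.PREFERRED:
            # Assumption of this sketch: at most one preferred term per concept.
            for other, reg in self._terms.items():
                if (other != key and reg.concept_id == concept_id
                        and reg.status is Status.PREFERRED):
                    raise ValueError(f"{concept_id} already prefers {other!r}")
        self._terms[key] = Registration(concept_id, status)

    def lookup(self, term: str) -> Optional[Registration]:
        return self._terms.get(term.casefold())

    def recommend(self, term: str) -> Optional[str]:
        """Return the preferred term for the concept that `term` refers to."""
        reg = self.lookup(term)
        if reg is None:
            return None
        for other, other_reg in self._terms.items():
            if (other_reg.concept_id == reg.concept_id
                    and other_reg.status is Status.PREFERRED):
                return other
        return None


registry = ConceptRegistry()
registry.register("database", "C2", Status.PREFERRED)
registry.register("databank", "C2", Status.OBSOLETE)
print(registry.recommend("databank"))
```

A reader who looks up an obsolete or deprecated term is thus steered to the preferred one, while the older term stays findable.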


Backup Slides

Organizing Relations – For Example Kinds of “Structure”

Add Relations Incrementally: Richer Schemata & Reusable Patterns

A simple Feature-State Model (from GRAIL) becomes a richer schema.

[Diagram labels: DO, data element, binary form]

Every DO is a type of Abstract Object that:
- is described by metadata
- has associated symbolic content
- may have parts (data element)

The semantic problem: "what is a data aggregation?"

- In statistics, aggregate data describes data combined from several measurements. When data are aggregated, groups of observations are replaced with summary statistics based on those observations, like an average.
- In economics, aggregate data or data aggregates describes high-level data that is composed from a multitude or combination of other more individual data.
- But it could just be some type of merging of data, and not integrated.

One term, three concepts.

What does "Understanding" such things involve?
- Terms & concepts (recognition, disambiguation via MEANING)
- Conceptual structures (e.g., taxonomies, semantic relations)
- Logical inferences (e.g., generalize from broad concepts or specialize)
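The difference between two of these senses shows up when both are applied to the same data. The sketch below is illustrative only: the observation values, the site names, and the function names `aggregate_statistical` and `aggregate_merge` are made up for this example.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical observations, grouped by measurement site.
observations: Dict[str, List[float]] = {
    "site_a": [2.0, 4.0],
    "site_b": [10.0, 20.0, 30.0],
}


def aggregate_statistical(groups: Dict[str, List[float]]) -> Dict[str, float]:
    """Statistical sense: replace each group of observations with its average."""
    return {name: mean(values) for name, values in groups.items()}


def aggregate_merge(groups: Dict[str, List[float]]) -> List[float]:
    """Merging sense: put all observations together, with no summarizing."""
    return [value for values in groups.values() for value in values]


print(aggregate_statistical(observations))  # {'site_a': 3.0, 'site_b': 20.0}
print(aggregate_merge(observations))        # [2.0, 4.0, 10.0, 20.0, 30.0]
```

Both results could be labeled "aggregated data", yet the first loses the individual observations and the second loses the grouping, which is why the vocabulary needs separate concepts behind the one term.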

Terminological Services Common in medical realm

Controlled English tools

Understanding Why Data Is Structured as It Is

[Diagram: entity-relationship sketch. Labels: Physical Object, Object, admin event, Event, Participates in, Type, Has Location, Place, Identified by, Location code, Repository event, Place - Repository event intersection]

- Entities are related meaningfully, and "attributes" are related entities (Place, Location code…).
- An intersection (Place - Repository event) is an artifact to link tables.
- Tuples have this static part. What is needed is a deeper understanding of why the data values are what they are: processes…