DDI – Metadata for social science data Wolfgang Zenk-Möltgen GESIS – Leibniz Institute for the Social Sciences DataCite.

Presentation transcript:

DDI – Metadata for social science data Wolfgang Zenk-Möltgen GESIS – Leibniz Institute for the Social Sciences DataCite Summer Meeting 2010 – Making datasets visible and accessible Hannover, 7-8 June 2010

Topics: About DDI; Basic DDI Concepts; Identification and Citation; Application Examples. Acknowledgement: DDI Alliance TIC members, namely Wendy Thomas, Arofan Gregory, Joachim Wackerow

About DDI DDI – Data Documentation Initiative The Data Documentation Initiative (DDI) is an effort to create an international standard for describing social science data. Expressed in XML, the DDI metadata specification now supports the entire life cycle of social science datasets. DDI metadata accompanies and enables data conceptualization, collection, processing, distribution, discovery, analysis, repurposing, and archiving. (Stefan Kramer)

History of DDI Concept of DDI and definition of needs grew out of the data archival community Established in 1995 as a grant funded project, initiated and organized by ICPSR February 2003 – Formation of DDI Alliance –Membership based alliance –Formalized development procedures

Members of DDI
Initial members
–Social science data archives
–Statistical data producers
Current membership expanded by
–Research data centers
–Data producers
–Commercial organizations
Member list: University of Alberta, Canada; Australian Bureau of Statistics (ABS); Australian Social Science Data Archive (ASSDA); University of California, Berkeley, Computer-Assisted Survey Methods Program and UCDATA; University of California, California Digital Library; Centro De Investigaciones Sociologicas (CIS), Spain; CEPS/INSTEAD, Luxembourg; Cornell University (CISER); Danish Data Archive; Data Archiving and Networked Services (DANS), The Netherlands; Finnish Social Science Data Archive; German Socio-Economic Panel Study (SOEP); GESIS - Leibniz Institute for the Social Sciences; University of Guelph; Institute for Quantitative Social Science (IQSS) at Harvard University; Institute for the Study of Labor (IZA); Inter-university Consortium for Political and Social Research (ICPSR); Massachusetts Institute of Technology (MIT); University of Minnesota, Minnesota Population Center; National Opinion Research Center (NORC); Norwegian Social Science Data Service (NSD); Open Data Foundation; Princeton University; Research Data Centre of the German Federal Employment Agency, Institute for Employment Research (IAB); Roper Center; Stanford University; Survey Research Operations, University of Michigan; Swedish National Data Service (SND); Swiss Foundation for Research in Social Sciences (FORS); United Kingdom Data Archive; University of Toronto; University of Wisconsin; U.S. Bureau of Labor Statistics (Associate Member); World Bank, Development Data Group (DECDG); Yale University

DDI is being used around the world Archives and Data Libraries Research Institutes and Data Service Centers International Organizations and National Statistical Agencies

DDI Versions
2000 – DDI 1.0
–Documentation of simple surveys, microdata only
2003 – DDI 2.0 and 2.1
–Extension to aggregate data
–Support for geographic material
2008 – DDI 3.0
–Lifecycle model: Shift from the codebook centric / variable centric model to capturing the lifecycle of data
–Focus on metadata creation and re-use
–Machine-actionable aspects of DDI to support programming
–CAI instruments supported by expanded description of the questionnaire
–Data series support (longitudinal surveys, panel studies, etc.)
–Support comparison by design and comparison-after-the-fact
–Improved support for describing complex data files
2009 – DDI 3.1
–Correction of bugs
–Introduction of final URN structure to ensure persistent URNs for all identified elements

Basic DDI 3 Concepts Lifecycle Concept Re-usable documentation –Modules –Maintainables, versionables, identifiables –Scheme-based (maintainable lists) Relations to other standards Controlled Vocabularies

The Data Life Cycle [Diagram; stages shown: Concept, Collection, Processing, Distribution, Discovery, Analysis, Archiving, Repurposing]

DDI 3 versus earlier versions
Previous versions followed the codebook idea, which produces documentation of a single social science dataset
DDI 3 with its lifecycle model allows for documentation at all stages, from study conception and data processing to analysis and repurposing of data
DDI 3 uses XML Schemas instead of an XML Document Type Definition (DTD) to have a stronger definition of metadata types, to make better reuse of content and to reach the goal of machine actionability
A DDI 3 instance now includes the simple instance from previous DDI versions. Multiple data products can be included for a single study.

DDI 3.1 Modules
Contain groups of related documentation elements
Some are related to the Lifecycle model, some are technically grouped
Modules: Archive module; Comparative module; Conceptual components module; Data collection module; Dataset module; Dublin Core Elements module; DDI profile module; Grouping module; Instance module; Logical product module; Physical data product module (plus inline n-cube, normal n-cube, tabular n-cube module and proprietary module); Physical instance module; Reusable module; Study unit module

Usage of DDI 3 Modules
Study Unit: Identification; Coverage (topical, temporal, spatial); Conceptual Components (universe, concept, representation (optional replication)); Purpose, Abstract, Proposal, Funding
Data Collection: Methodology; Question Scheme (question, response domain); Instrument (using Control Construct Scheme); Coding Instructions (question to raw data, raw data to public file); Interviewer Instructions
Logical Product: Category Schemes; Coding Schemes; Variables; NCubes; Variable and NCube Groups; Data Relationships
Physical Data Structure: Links to Data Relationships; Links to Variable or NCube Coordinate; Description of physical storage structure (in-line, fixed, delimited or proprietary)
Physical Instance: One-to-one relationship with a data file; Coverage constraints; Variable and category statistics
Archive: Organization or individual which has control over the metadata; Lifecycle events; Archive specific information; etc.

Maintainables, Versionables, Identifiables
[Diagram: inheritance hierarchy of the following element kinds]
Maintainables (may be maintained separately, need agency)
Versionables (may be versioned in the form 1.0.0)
Identifiables (may be identified and be referenced, either by ID or URN)
Other DDI elements

DDI Schemes
Schemes = Lists of elements of one type
Examples
archive
–OrganizationScheme
datacollection
–QuestionScheme
–ControlConstructScheme
–InterviewerInstructionScheme
conceptualcomponent
–ConceptScheme
–UniverseScheme
–GeographicStructureScheme
–GeographicLocationScheme
logicalproduct
–CategoryScheme
–CodeScheme
–VariableScheme
–NCubeScheme
physicaldataproduct
–PhysicalStructureScheme
–RecordLayoutScheme

Relationship to Other Standards
Dublin Core
–Basic bibliographic citation information
–Basic holdings and format information
METS
–Upper level descriptive information for managing digital objects
–Provides specified structures for domain specific metadata
OAIS
–Reference model for the archival lifecycle
PREMIS
–Supports and documents the digital preservation process
ISO – Geography (FGDC)
–Metadata structure for describing geographic feature files such as shape, boundary, or map image files and their associated attributes
ISO/IEC
–International standard for representing metadata in a Metadata Registry
–Consists of a hierarchy of concepts with associated properties for each concept
SDMX
–Exchange of statistical information (time series/indicators)
–Supports metadata capture as well as implementation of registries

Controlled Vocabularies
Not part of the standard
Recommendations on: LifeCycleEventType, CommonalityTypeCoded, TimeMethod, ResponseUnit, AggregationMethodsType, DataType, SoftwarePackage, CharacterSet, CategoryStatistic, SummaryStatistic, AnalysisUnit
Example: TimeMethod may be
–Longitudinal (Cohort or Trend)
–Panel (Continuous or Interval)
–TimeSeries (Continuous or Discrete)
–CrossSectional
–CrossSectionalAdHocFollowUp
–Other

Identification in DDI 3
Two possibilities to identify an element:
–Specify the tag: Agency and Version are inherited
–Use the specially structured URN: Agency and Version must be included
The structured URN approach is preferred
These IDs/URNs can be referenced
Both ways need a resolver service that turns the names into locations to make effective re-use possible
The DDI Alliance is currently working on that, based on the DNS (Domain Name System) infrastructure approach

URN Identification Examples
URN of a maintainable object
A variable scheme in DDI 3 would be identified via a URN as follows: urn=urn:ddi:us.icpsr:VariableScheme.V_GENDER_SCHEME
URN of a versionable object
All versionable objects are contained within maintainable objects. A variable in DDI 3 would be identified via a URN as follows: urn=urn:ddi:us.icpsr:VariableScheme.V_GENDER_SCHEME.1.0.0:Variable.Gender
URN of an identifiable object
An identifiable object may be a direct child of a maintainable object or be contained by a versionable object within a maintainable object. The full path should be provided to facilitate locating the item when referenced. The identifiable object in the above hierarchy would be identified via a URN as follows: urn=urn:ddi:us.icpsr:DataCollection.DC_ :TimeMethod.TM_
(from the DDI Technical Specification Part I)
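The colon-separated URN layout in these examples can be taken apart mechanically. The following is a minimal sketch, not the DDI Alliance's reference parser: it assumes the form `urn:ddi:<agency>:<Type>.<ID>[.<major>.<minor>.<revision>]` repeated for each nesting level, and it assumes identifiers contain no dots. The names `UrnPart`, `DdiUrn` and `parse_ddi_urn` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UrnPart:
    element_type: str        # e.g. "VariableScheme"
    identifier: str          # e.g. "V_GENDER_SCHEME"
    version: Optional[str]   # e.g. "1.0.0", or None when no version is given


@dataclass
class DdiUrn:
    agency: str              # e.g. "us.icpsr"
    parts: List[UrnPart]     # outermost (maintainable) object first


def _parse_part(segment: str) -> UrnPart:
    # Assumption: "<Type>.<ID>" optionally followed by a three-part version.
    tokens = segment.split(".")
    if len(tokens) < 2 or not tokens[0] or not tokens[1]:
        raise ValueError(f"expected Type.ID in segment: {segment!r}")
    version_tokens = tokens[2:]
    if version_tokens and not (
        len(version_tokens) == 3 and all(t.isdigit() for t in version_tokens)
    ):
        raise ValueError(f"unexpected version in segment: {segment!r}")
    version = ".".join(version_tokens) if version_tokens else None
    return UrnPart(tokens[0], tokens[1], version)


def parse_ddi_urn(urn: str) -> DdiUrn:
    fields = urn.split(":")
    if len(fields) < 4 or fields[0] != "urn" or fields[1] != "ddi":
        raise ValueError(f"not a DDI URN: {urn!r}")
    return DdiUrn(agency=fields[2], parts=[_parse_part(f) for f in fields[3:]])


parsed = parse_ddi_urn(
    "urn:ddi:us.icpsr:VariableScheme.V_GENDER_SCHEME.1.0.0:Variable.Gender"
)
print(parsed.agency, [p.identifier for p in parsed.parts])
```

Keeping the path as an ordered list mirrors the slide's point that the full path from the maintainable object down to the referenced item should be carried in the URN.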

Citation in DDI

OtherMaterial
Elements
Citation: holds full citation information for the external object
ExternalURLReference: location of the external object
ExternalURNReference: URN expression for the external object
MIMEType: the standard internet MIME type for applications
Relationship: reference to DDI object and description of relation to it
Segment: specifies part of external object (e.g. with audio/video files)
UserID: unique ID of other types, e.g. DOI
Attributes
Action: used for local overrides in case of inheritance ("Add" | "Update" | "Delete")
id: DDI ID of the element
isIdentifiable: fixed value of "true"
objectSource: source name or location
type: required type code for type of the external object
urn: DDI URN of the element
xml:lang: optional identification of the language of the external object

DOIs and DDI URNs
Relationship still unclear
DDI URN resolution service still needed
Every identifiable element could be registered with a DOI, but that would result in a huge number of DOIs
Alternatively, only the study level could be registered with a DOI, e.g. each StudyUnit
In DDI, all registered DOIs should be documented
Conversely, each DOI should contain the DDI URN in its metadata
Diverse software applications will make use of them

Application Examples Enhanced Publications –Providing Information to connect Publications with the underlying datasets/variables used –Making retrieval of research with specific datasets/variables possible Version History of Datasets –Documenting errata and correction history –Making it easy to cite used data

Supporting Enhanced Publications
Publications with References to Data: DDI 3.1 URN contains: Agency, Object, Version
[Diagram: URN resolution. Publication with References (URNs) → DDI Alliance: find agency gesis.de.ddi, return resolver address → find object, return URL → request document, return document (URL of Documentation and/or Data)]
Example URN: urn:ddi:de.gesis:VariableScheme.ZA3811_VarSch.1.0.0:Variable.V
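The lookup sequence on this slide (find the agency, get its resolver address, find the object, get a URL) can be sketched as two table lookups. This is only an illustration under stated assumptions: the slides say the real service is still being worked on and is meant to build on DNS, whereas the sketch uses in-memory dictionaries. The function `resolve`, both registries and all `example.org` addresses are hypothetical.

```python
from typing import Dict

# Hypothetical stand-in for step 1: agency -> address of that agency's resolver.
AGENCY_RESOLVERS: Dict[str, str] = {
    "de.gesis": "https://resolver.example.org/gesis",
}

# Hypothetical stand-in for step 2: per resolver, maintainable object -> document URL.
OBJECT_LOCATIONS: Dict[str, Dict[str, str]] = {
    "https://resolver.example.org/gesis": {
        "VariableScheme.ZA3811_VarSch.1.0.0": "https://data.example.org/ZA3811/variables.xml",
    },
}


def resolve(urn: str) -> str:
    """Return the URL of the document holding the object named by a DDI URN."""
    fields = urn.split(":", 3)
    if len(fields) != 4 or fields[0] != "urn" or fields[1] != "ddi":
        raise ValueError(f"not a DDI URN: {urn!r}")
    agency, object_path = fields[2], fields[3]

    # Step 1: find agency, return resolver address.
    try:
        resolver = AGENCY_RESOLVERS[agency]
    except KeyError:
        raise LookupError(f"unknown agency: {agency}") from None

    # Step 2: find object, return URL. Assumption: the document is located
    # through the outermost (maintainable) part of the path.
    maintainable = object_path.split(":")[0]
    try:
        return OBJECT_LOCATIONS[resolver][maintainable]
    except KeyError:
        raise LookupError(f"unknown object: {maintainable}") from None


print(resolve("urn:ddi:de.gesis:VariableScheme.ZA3811_VarSch.1.0.0:Variable.V"))
```

The final step on the slide, requesting and returning the document, would be an ordinary HTTP fetch of the returned URL and is left out here.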

Supporting Enhanced Publications DSDM DDI 3 EPE Simple Export Wizard 1.2.0

Enhancing Publications - DatapluS A University of Tilburg and Centerdata project, supported by GESIS and the European Values Study

Version History of Datasets
The GESIS data catalogue holds study descriptions with links to data access
GESIS is currently introducing a common versioning policy for datasets
Starting with an initial version and increasing the major, minor or revision number according to the change in the dataset
Corresponding to each published version, a DOI will be created
That gives transparency in the history of data processing
Citation of used datasets will include the specific version to ease replication
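A three-part version number of this kind can be incremented as sketched below. The slide does not say which kinds of change trigger which level, so the caller chooses the level explicitly. Resetting the lower components to zero on a major or minor increase is an assumption borrowed from common versioning practice, not something stated by GESIS, and the name `bump` is hypothetical.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split "major.minor.revision" into three integers."""
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected major.minor.revision, got {version!r}")
    return int(parts[0]), int(parts[1]), int(parts[2])


def bump(version: str, level: str) -> str:
    """Return the next version after a change at the given level."""
    major, minor, revision = parse_version(version)
    if level == "major":
        return f"{major + 1}.0.0"              # assumption: lower parts reset
    if level == "minor":
        return f"{major}.{minor + 1}.0"        # assumption: revision resets
    if level == "revision":
        return f"{major}.{minor}.{revision + 1}"
    raise ValueError(f"unknown level: {level!r}")


# Hypothetical history of one dataset; each published version would get its own DOI.
history = ["1.0.0"]
for change in ["revision", "minor", "major"]:
    history.append(bump(history[-1], change))
print(history)
```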

Data Catalogue

Thank you!