
1

2
• Name and organization
• Have you worked with DDI before? (2 or 3)
• If not, are you familiar with XML?
• What kind of CAI systems do you use?
• Goals for today

3
• Introduction
  – DDI 3 Background
  – XML Background
  – How DDI 3 documents survey instruments
• Creating DDI 3 and Documentation
  – Manual Markup
  – Using functionality from CAI systems
  – Custom development
  – Colectica
• Discussion
  – Questions and discussion
  – Additional documentation activities

4 Data Documentation Initiative DDI3 Background

5 Background
• Concept of DDI and definition of needs grew out of the data archival community
• Established in 1995 as a grant-funded project initiated and organized by ICPSR
• Members:
  – Social Science Data Archives (US, Canada, Europe)
  – Statistical data producers (including US Bureau of the Census, the US Bureau of Labor Statistics, Statistics Canada and Health Canada)
• February 2003: Formation of DDI Alliance
  – Membership-based alliance
  – Formalized development procedures
Copyright © 2008 GESIS

6 Origins of the DDI Alliance
• Versions 1.* and 2.* were developed by an informal network of individuals from the social science community and official statistics
  – Funding was through grants
• It was decided that a more formal organization would help to drive the development of the standard forward
  – Many new features were requested
  – The DDI Alliance was born to facilitate the development in a consistent and ongoing fashion

7 Requirements for 3.0
• Improve and expand the machine-actionable aspects of the DDI to support programming and software systems
• Support CAI instruments through expanded description of the questionnaire (content and question flow)
• Support the description of data series (longitudinal surveys, panel studies, recurring waves, etc.)
• Support comparison, in particular comparison by design but also comparison after the fact (harmonization)
• Improve support for describing complex data files (record and file linkages)
• Provide improved support for geographic content to facilitate linking to geographic files (shape files, boundary files, etc.)

8 DDI 3.0 and the Data Life Cycle
• A survey is not a static process: it evolves dynamically over time and involves many agencies and individuals
• DDI 2.x is about archiving; DDI 3.0 covers the entire “life cycle”
• 3.0 focuses on metadata reuse (minimizes redundancies and discrepancies, supports comparison)
• Also supports multilingual content, grouping, geography, and others
• 3.0 is extensible

9 Development of DDI 3.0
• 2004: Acceptance of a new DDI paradigm
  – Lifecycle model
  – Shift from the codebook-centric / variable-centric model to capturing the lifecycle of data
  – Agreement on expanded areas of coverage
• 2005
  – Presentation of schema structure
  – Focus on points of metadata creation and reuse
• 2006
  – Presentation of first complete 3.0 model
  – Internal and public review
• 2007
  – Vote to move to Candidate Version
  – Establishment of a set of use cases to test application and implementation
• 2008
  – April: DDI 3.0 published

10
• XML: Extensible Markup Language
• Designed to transport and store data
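As a minimal sketch of what "transport and store data" means in practice, the following Python snippet parses a small XML document and reads values back out of it. The element and attribute names (`survey`, `question`, `text`, `id`, `type`) are made up for illustration and are not DDI.

```python
import xml.etree.ElementTree as ET

# A made-up document for illustration only; these tag names are not DDI.
DOC = """<survey id="s1">
  <question id="q1" type="numeric">
    <text>How many people live in this household?</text>
  </question>
  <question id="q2" type="text">
    <text>What is your occupation?</text>
  </question>
</survey>"""


def question_texts(root):
    """Map each question id to the content of its <text> child."""
    return {q.get("id"): q.findtext("text") for q in root.iter("question")}


root = ET.fromstring(DOC)
for question_id, text in question_texts(root).items():
    print(question_id, text)
```

The data (the question wording) and its description (identifiers, types) travel together in one text file, which is the property DDI builds on.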

11

12

13

14

15 XML Schemas, DDI Modules, and DDI Schemes
Modules: Instance, Study Unit, Physical Instance, DDI Profile, Comparative, Data Collection, Logical Product, Physical Data Structure, Archive, Conceptual Component, Reusable, Ncube, Inline ncube, Tabular ncube, Proprietary, Dataset

16 XML Schemas, DDI Modules, and DDI Schemes

17 XML Schemas, DDI Modules, and DDI Schemes
• Instance
• Study Unit
• Physical Instance
• DDI Profile
• Comparative
• Data Collection
  – Question Scheme
  – Control Construct Scheme
  – Interviewer Instruction Scheme
• Logical Product
  – Category Scheme
  – Code Scheme
  – Variable Scheme
  – NCube Scheme
• Physical Data Structure
  – Physical Structure Scheme
  – Record Layout Scheme
• Archive
  – Organization Scheme
• Conceptual Component
  – Concept Scheme
  – Universe Scheme
  – Geographic Structure Scheme
  – Geographic Location Scheme
• Reusable
• Ncube, Inline ncube, Tabular ncube, Proprietary, Dataset

18 Maintainable Schemes
Packages of reusable metadata maintained by a single agency:
• Category Scheme
• Code Scheme
• Concept Scheme
• Control Construct Scheme
• Geographic Structure Scheme
• Geographic Location Scheme
• Interviewer Instruction Scheme
• Question Scheme
• NCube Scheme
• Organization Scheme
• Physical Structure Scheme
• Record Layout Scheme
• Universe Scheme
• Variable Scheme

19 Designed to Support Registries
• A “Registry” is a catalog of metadata resources
• Resource package
  – Structure to publish non-study-specific materials for reuse
• Extracting specified types of information into schemes
  – Universe, Concept, Category, Code, Question, Instrument, Variable, etc.
• Allowing for either internal or external references
  – Can include other schemes by reference and select only desired items
• Providing Comparison Mapping
  – Target can be external harmonized structure

20 Data Collection
• Methodology
• Question Scheme
  – Question
  – Response domain
• Instrument
  – using Control Construct Scheme
• Coding Instructions
  – question to raw data
  – raw data to public file
• Interviewer Instructions
Notes:
• Question and Response Domain designed to support question banks
  – Question Scheme is a maintainable object
• Organization and flow of questions into Instrument
  – Used to drive systems like CASES and Blaise
• Coding Instructions
  – Reuse by Questions, Variables, and comparison

21 QuestionItem in DDI

22 QuestionItem

23
• Opening tag & identification
• QuestionText
• NumericDomain
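The XML shown on this slide did not survive in the transcript. As a rough sketch of the three parts it names, the Python below builds a QuestionItem with an identifier, a QuestionText, and a NumericDomain. Only those three element names come from the slides; the namespace URI, the `id` and `type` attributes, and the `LiteralText`/`Text` wrappers are assumptions that should be checked against the DDI 3 schema files before use.

```python
import xml.etree.ElementTree as ET

# Assumed DDI 3.0 data collection namespace; verify against your schema files.
D = "ddi:datacollection:3_0"
ET.register_namespace("d", D)


def make_question_item(question_id, text, number_type="Integer"):
    """Build a QuestionItem holding a QuestionText and a NumericDomain."""
    # Opening tag and identification (the "id" attribute is an assumption).
    item = ET.Element(f"{{{D}}}QuestionItem", {"id": question_id})
    # QuestionText; the LiteralText/Text wrappers are assumed, not from the slide.
    question_text = ET.SubElement(item, f"{{{D}}}QuestionText")
    literal = ET.SubElement(question_text, f"{{{D}}}LiteralText")
    ET.SubElement(literal, f"{{{D}}}Text").text = text
    # NumericDomain; the "type" attribute is an assumption.
    ET.SubElement(item, f"{{{D}}}NumericDomain", {"type": number_type})
    return item


item = make_question_item("Q1", "How old are you?")
print(ET.tostring(item, encoding="unicode"))
```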

24

25 In a QuestionScheme

26 ControlConstructScheme with QuestionConstructs

27 An Instrument

28 Those all go in a DataCollection element

29 The DataCollection element goes in a StudyUnit, which goes in a DDIInstance or ResourcePackage

30

31
• Create QuestionScheme and QuestionItems

32
• Create ControlConstructScheme
• Add QuestionReferences

33
• Add control flow items to ControlConstructScheme
• Include a main Sequence element

34
• Create the Instrument element
• Add the main ControlConstructReference

35
• Create the DDIInstance element
• Create the StudyUnit element
• Create the DataCollection element
• Add the QuestionScheme, ControlConstructScheme, and Instrument to the DataCollection element
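The five steps on slides 31 to 35 can be sketched in Python roughly as follows. This is an illustration of the nesting only, under several assumptions: the namespace URIs, the `id` attributes, the `QuestionConstruct` wrapper, and especially the `ref` attribute used for references are simplifications chosen for this sketch, so the output is not expected to validate against the real DDI schemas.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; check them against the DDI schema files you use.
NS = {
    "ddi": "ddi:instance:3_0",
    "s": "ddi:studyunit:3_0",
    "d": "ddi:datacollection:3_0",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def tag(prefix, name):
    """Return the namespace-qualified tag name ElementTree expects."""
    return f"{{{NS[prefix]}}}{name}"


def build_instance(questions):
    """Build a DDIInstance from (id, text) pairs given in asking order."""
    # Step 1: QuestionScheme with QuestionItems (question text simplified).
    scheme = ET.Element(tag("d", "QuestionScheme"), {"id": "QS1"})
    for qid, text in questions:
        item = ET.SubElement(scheme, tag("d", "QuestionItem"), {"id": qid})
        ET.SubElement(item, tag("d", "QuestionText")).text = text

    # Step 2: ControlConstructScheme with one QuestionReference per question.
    # The "ref" attribute is a hypothetical stand-in for DDI's reference structure.
    constructs = ET.Element(tag("d", "ControlConstructScheme"), {"id": "CCS1"})
    for qid, _ in questions:
        qc = ET.SubElement(constructs, tag("d", "QuestionConstruct"), {"id": f"QC_{qid}"})
        ET.SubElement(qc, tag("d", "QuestionReference"), {"ref": qid})

    # Step 3: a main Sequence that lists the constructs in order.
    sequence = ET.SubElement(constructs, tag("d", "Sequence"), {"id": "SEQ_MAIN"})
    for qid, _ in questions:
        ET.SubElement(sequence, tag("d", "ControlConstructReference"), {"ref": f"QC_{qid}"})

    # Step 4: the Instrument points at the main Sequence.
    instrument = ET.Element(tag("d", "Instrument"), {"id": "INSTR1"})
    ET.SubElement(instrument, tag("d", "ControlConstructReference"), {"ref": "SEQ_MAIN"})

    # Step 5: DDIInstance > StudyUnit > DataCollection, then add the three parts.
    instance = ET.Element(tag("ddi", "DDIInstance"), {"id": "INST1"})
    study = ET.SubElement(instance, tag("s", "StudyUnit"), {"id": "SU1"})
    collection = ET.SubElement(study, tag("s", "DataCollection") if False else tag("d", "DataCollection"), {"id": "DC1"})
    collection.extend([scheme, constructs, instrument])
    return instance


instance = build_instance([("Q1", "How old are you?"), ("Q2", "Are you employed?")])
print(ET.tostring(instance, encoding="unicode"))
```

Keeping the questions in a scheme and referring to them from the control constructs, rather than embedding them in the Sequence, is what lets the same QuestionItem be reused by other instruments.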

36
• Check the XML document against the DDI schemas to see if we got it right.
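Full validation needs the DDI schema files and a schema-aware validator, neither of which is reproduced here. As a lightweight pre-check only (not schema validation), the sketch below confirms that a document is well-formed and that it follows the nesting described on slides 28, 29, and 35. The function name `precheck` and the rule set are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Children the slides say belong in a DataCollection element.
REQUIRED_IN_DATA_COLLECTION = ("QuestionScheme", "ControlConstructScheme", "Instrument")


def local_name(tag):
    """Strip any {namespace} prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def precheck(xml_text):
    """Return a list of problems; an empty list means the pre-check passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    problems = []
    if local_name(root.tag) not in ("DDIInstance", "ResourcePackage"):
        problems.append(f"unexpected root element {local_name(root.tag)}")

    collections = [e for e in root.iter() if local_name(e.tag) == "DataCollection"]
    if not collections:
        problems.append("no DataCollection element found")
    for collection in collections:
        present = {local_name(child.tag) for child in collection}
        for name in REQUIRED_IN_DATA_COLLECTION:
            if name not in present:
                problems.append(f"DataCollection is missing {name}")
    return problems


sample = (
    "<DDIInstance><StudyUnit><DataCollection>"
    "<QuestionScheme/><ControlConstructScheme/><Instrument/>"
    "</DataCollection></StudyUnit></DDIInstance>"
)
print(precheck(sample) or "pre-check passed")
```

A document that passes this check can still fail real schema validation, so it is best treated as a quick filter before running a proper validator.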

37
• We have DDI, now we need documentation

38
• Custom Development
• MQDS
• Colectica

39

40 Michigan Questionnaire Documentation System (MQDS)
Sue Ellen Hansen, Nicole Kirgis

41 What Does MQDS Do?
• Facilitates automated documentation and harmonization of Blaise survey instruments and datasets
  – Extracts survey question metadata
  – Standardized format

42 Survey Question Metadata
• Question universe
• Variable name and label
• Question text
• Question variable text (fills)
• Data type
• Code values and code text
• Skip instructions
• etc.
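To make the list above concrete, here is one possible record layout for the metadata extracted per question, written as a Python dataclass. The class name, field names, and types are hypothetical; they mirror the bullet points on the slide and are not MQDS's actual data model.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class QuestionMetadata:
    """Hypothetical per-question record mirroring the slide's list."""
    variable_name: str
    variable_label: str
    question_text: str
    data_type: str
    universe: str = ""                                    # who is asked the question
    fills: List[str] = field(default_factory=list)        # variable text inserted at run time
    codes: Dict[str, str] = field(default_factory=dict)   # code value -> code text
    skip_instructions: str = ""


example = QuestionMetadata(
    variable_name="EMPLOYED",
    variable_label="Currently employed",
    question_text="Is ^NAME currently employed?",
    data_type="enumerated",
    universe="Respondents aged 16 or older",
    fills=["^NAME"],
    codes={"1": "Yes", "2": "No"},
    skip_instructions="If No, skip to the next section",
)
print(asdict(example))
```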

43 MQDS Version 1
• Extracted metadata from Blaise data model as XML tagged data
• Provided user interface for selection of
  – Blaise files
  – Instrument questions and sections
  – Types of metadata to extract
  – Languages to display
  – Style sheet for generation of instrument documentation or codebook

44 Using MQDS V1 XML: Codebook in Five Languages
National Latino and Asian American Study
www.icpsr.umich.edu/CPES

45 MQDS Version 1 Limitations
– XML not DDI-compliant
  • DDI Version 2 did not have XML tags for all metadata provided by Blaise
  • Did not provide easy means of adding XML tags without becoming noncompliant
– XML files for complex surveys can be very large (text files)
  • Entire files had to be processed in computer memory
  • Limited ability to fully automate documentation

46 DDI Version 3
• Released April 2008
• Focus on complete data lifecycle
  – going beyond the codebook

47 DDI Version 3
Included extensions proposed by DDI working group on instrument design
Persistent Content of Question:
• Question text
  – Static
  – Dynamic or variable
• Multiple-part question
• Response domain
  – Open
  – Set categories
  – Special types (date, time, etc.)
• Definitional text
Use of Question in Instrument:
• Order and routing
  – Sequence / skip patterns
  – Loops
• Universe
• Analysis unit
• Instructions

48 MQDS Version 3
• Joint SRC and ICPSR venture
• Goals:
  – Address version 2 limitations
    • Process Blaise instrument of any size
  – Exploit new elements and validate to the recently released DDI version 3 standard
  – Move from processing XML metadata in memory to streaming metadata to a relational database
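The last goal, streaming metadata into a relational database instead of holding a whole XML file in memory, can be illustrated with a small Python sketch. This is not MQDS code: it uses SQLite from the standard library as a stand-in for SQL Server, and the `question`/`text` element names and the table layout are made up for the example.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def stream_questions(xml_source, conn):
    """Insert each <question> into the database as soon as it is parsed."""
    conn.execute("CREATE TABLE IF NOT EXISTS question (id TEXT PRIMARY KEY, text TEXT)")
    count = 0
    # iterparse yields elements incrementally instead of building the full tree first.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "question":
            conn.execute(
                "INSERT INTO question (id, text) VALUES (?, ?)",
                (elem.get("id"), elem.findtext("text")),
            )
            count += 1
            elem.clear()  # drop the element's children and text once stored
    conn.commit()
    return count


# Hypothetical input; a real instrument export would be read from a file.
SAMPLE = (
    b"<instrument>"
    b"<question id='q1'><text>How old are you?</text></question>"
    b"<question id='q2'><text>Are you employed?</text></question>"
    b"</instrument>"
)

connection = sqlite3.connect(":memory:")
print(stream_questions(io.BytesIO(SAMPLE), connection), "questions stored")
```

Because each question is written out and cleared as soon as its closing tag is seen, memory use depends on the size of one question rather than on the size of the whole instrument.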

49 MQDS Version 3 Relational Database: Import, Export, Transform
1. Import
  – User specifies input files (location, file type, etc.)
  – Sources: Blaise Datamodel (BMI), Blaise Database (BDB), other file types (e.g. SAS, SPSS, etc.)
2. Export
  – XML (DDI 3)
  – User specifies output files (location, language/locale, XML output options, etc.)
3. Transform
  – Outputs: Codebook, Questionnaire
  – User specifies stylesheet selection criteria, type of output desired (html, rtf, pdf), etc.
Relational Db
  – SQL Server / SQL Server Express
  – Database connection settings
  – DDI 3 elements not in *.bmi

50 MQDS Version 3
• Relational database
  – DDI-compliant standardized tables
  – Flexibility for SRC and ICPSR to add extensions that meet their specific organizational needs
  – Allows
    • Automated documentation of any Blaise survey instrument
    • Importing and documenting data produced by other software
    • Lower-cost development of other tools that facilitate editing and disseminating data

51 MQDS V3 Prototype: Exporting Language XML

52 MQDS Development
• Expect to release Summer 2009
• Working out a distribution plan for Blaise users

