Microdata Management Toolkit Tools to facilitate archive and dissemination of surveys A PDF for Data? Metadata Editor / Nesstar Publisher 3.5 CD builder.


1 Microdata Management Toolkit
Tools to facilitate archive and dissemination of surveys
A PDF for Data?
Metadata Editor / Nesstar Publisher 3.5
CD Builder
Guidelines for Archiving & Dissemination
Session E2 - Thursday, 26 May
Tools for Preservation: Integration and Assessment
Preserving and improving the access to large and complex household surveys
Mark Diggory, Harvard / MIT Data Center, mdiggory@latte.harvard.edu
Olivier Dupriez, World Bank, odupriez@worldbank.org
Pascal Heus, World Bank, pascal.heus@gmail.com
Jostein Ryssevik, Nesstar Ltd., Jostein.Ryssevik@nsd.uib.no

2 Background
Sponsored by World Bank / International Household Survey Network
  – Presented earlier this week
  – Created in September 2004
  – International organizations actively sponsoring household surveys
  – Marrakech Action Plan for Statistics
  – http://www.surveynetwork.org
Surveys are often under-used: limited access for users leads to poor return on investment, limited impact on the ground, and difficulties in policy making
Common obstacles: quality, technical capacity, legal/political issues
Common problems:
  – Accessibility, Timeliness, Coherence
  – Lack of metadata / documentation / data
  – Poorly organized archives
To address technical issues: need for new tools and guidelines → Microdata Management Toolkit

3 Toolkit Requirements
User-friendly software suite and guidelines to archive and disseminate microdata
Facilitate metadata exchange: compliant with common XML specifications (DDI, Dublin Core)
Facilitate archiving: put together metadata and data, address common quality control issues
Facilitate dissemination: simple to redistribute on CD/DVD and the web, answer producer/depositor needs (subset, anonymization, quality control)
Works with common data formats (SPSS, SAS, Stata, Statistica, CSPro/IMPS/ISSA)
Multilingual support
Free or inexpensive
Availability of technical support and training
Accompanied by guidelines and a training program
Supported by national, international and research communities

4 Core file format - A PDF for Data?
How can we carry around the information?
Looking at documents → PDF
Can we do the same for data?
  – Yes, a Nesstar file holds data + metadata!
Partner with Nesstar Ltd to develop new tools
  – Why: strong tool for metadata management, available today, community acceptance, technical support, past experience
Development agreement
  – Enhance the existing Publisher software and make it available as a stand-alone product
  – Open binary file format (not a black box) and availability of an API
  – Free data reader (like PDF) that allows users to access the data and metadata and convert them to their favorite format
  – Special licensing agreement for developing countries

5 Toolkit Components
Archiving: Metadata Editor (World Bank / Nesstar Ltd.)
  – To compile survey data, documentation and metadata in a standard format (Nesstar/DDI). Free data reader for users.
  – Built on Nesstar Publisher
Dissemination: CD Builder (World Bank / Mark Diggory)
  – To facilitate the publication of survey data, documentation and metadata on CD-ROM and on the web (transforms DDI into HTML-based navigation)
  – Based on the Eclipse Platform, open source
Guidelines: Handbook (World Bank / ICPSR)
  – To provide data producers with information on policies and legal aspects of data dissemination, guidelines to document datasets, and recommendations for setting up a data archive

6 The Toolkit Process
1. Import data and compile metadata
2. Import metadata and prepare CD-ROM
3. Generate HTML-based CD-ROM

7 What is the Nesstar Publisher?
Advanced data management program
DDI / DC metadata authoring tool
Import/export to common data formats
Standalone or with Nesstar server
http://www.nesstar.com
Easy editing/creation of DDI documented datasets. No need to know XML.
Full DDI import and export for single file/language studies.
Templates that let your organization standardize the use of the DDI.
Default texts in templates.
Local controlled vocabularies.
Possible to share the documentation work between different persons.
A Category Repository that lets you share categories within a dataset and between datasets.
Variable groups.
Easy setting of weights.
Frequency and summary statistics output, with options for each variable.
Import and export to the most common statistical formats.
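To make the "setting of weights" and "frequency output" features concrete, here is a minimal Python sketch of a weighted frequency table. It is not the Publisher's implementation; the function name `weighted_frequencies`, the variable `area` and the weights are made up for illustration.

```python
from collections import defaultdict


def weighted_frequencies(values, weights=None):
    """Return {category: (weighted_count, percent)} for one categorical variable."""
    if weights is None:
        # Unweighted case: every observation counts once.
        weights = [1.0] * len(values)
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")

    totals = defaultdict(float)
    for value, weight in zip(values, weights):
        totals[value] += weight

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        category: (count, 100.0 * count / grand_total)
        for category, count in totals.items()
    }


# Hypothetical variable (area of residence) with made-up sampling weights.
area = ["urban", "rural", "rural", "urban"]
weight = [1.5, 0.5, 1.0, 1.0]
table = weighted_frequencies(area, weight)

for category, (count, percent) in sorted(table.items()):
    print(f"{category:<6} {count:6.1f} {percent:6.1f}%")
```

Calling the function without weights gives plain counts, which is the difference a weight variable makes in the frequency output.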

8 What is the Metadata Editor?
Nesstar Publisher 3.0:
  – A tool to prepare and publish surveys to a Nesstar Server
  – Sold as a component of the Nesstar Software Suite
  – Multiple components (editor, hierarchy, cube, resources)
→ New Model for Version 3.5:
  – All components integrated under one interface
  – A study is stored in a single Nesstar file
  – Enhanced and new functionalities: quality control, computed variables, recodes, anonymize, subset
  – Availability of a free Nesstar Data Reader
  – Produce DDI / Dublin Core (DC) XML documents
  – Available as a stand-alone software package
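As an illustration of what a Dublin Core XML document can look like, the following Python sketch writes a few DC elements using the standard Dublin Core element namespace. The wrapper element `metadata`, the function name `dublin_core_record` and the field values are assumptions for the example, not output of the Metadata Editor.

```python
import xml.etree.ElementTree as ET

# Standard namespace of the 15 Dublin Core elements (title, creator, ...).
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialize a {element_name: value} mapping as a small Dublin Core XML record."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical description of a survey document (values are made up).
xml_text = dublin_core_record(
    {
        "title": "Survey questionnaire",
        "creator": "National Statistics Office",
        "type": "Text",
        "language": "en",
    }
)
print(xml_text)
```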

9 Editor key features (1)
All components integrated under a single interface
Import/export supports common data formats
All surveys stored as projects in a single tree hierarchy
Template-driven metadata editor lets users decide which DDI/DC elements to use

10 Editor key features (2) Easy to use interface for document, survey, file and variable metadata editing

11 Editor key features (3)
Data import preserves existing dictionary and generates summary statistics
Manage variable groups
DDI and Dublin Core metadata import/export

12 Editor key features (4)
Support for survey documentation as Dublin Core resources
Description of a dataset's primary keys and hierarchy, and validation of dataset relationships
Automatic randomization of primary key variables
AND MORE…
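The key-related features on this slide can be sketched in Python as follows. This is only an illustration of the ideas (unique primary keys, child records matching a parent file, and replacing identifiers with random ones); the function names and the household/person example are hypothetical and do not describe how the Editor implements them.

```python
import random


def duplicate_keys(rows, key):
    """Return the key values that occur more than once (empty set = valid primary key)."""
    seen, duplicates = set(), set()
    for row in rows:
        value = tuple(row[column] for column in key)
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return duplicates


def orphan_keys(child_rows, parent_rows, key):
    """Return child key values that have no matching record in the parent file."""
    parents = {tuple(row[column] for column in key) for row in parent_rows}
    children = {tuple(row[column] for column in key) for row in child_rows}
    return children - parents


def randomize_key(parent_rows, child_rows, column, seed=None):
    """Replace identifiers with shuffled sequence numbers, consistently in both files.

    Assumes the relationship was validated first (no orphan child records).
    """
    rng = random.Random(seed)
    old_ids = sorted({row[column] for row in parent_rows})
    new_ids = list(range(1, len(old_ids) + 1))
    rng.shuffle(new_ids)
    mapping = dict(zip(old_ids, new_ids))

    def remap(rows):
        return [{**row, column: mapping[row[column]]} for row in rows]

    return remap(parent_rows), remap(child_rows)


# Made-up household and person files linked by "hhid".
households = [{"hhid": 101}, {"hhid": 102}]
persons = [
    {"hhid": 101, "pid": 1},
    {"hhid": 101, "pid": 2},
    {"hhid": 102, "pid": 1},
]

assert not duplicate_keys(households, ["hhid"])
assert not duplicate_keys(persons, ["hhid", "pid"])
assert not orphan_keys(persons, households, ["hhid"])

new_households, new_persons = randomize_key(households, persons, "hhid", seed=0)
```

Randomizing the key in the parent and child files with the same mapping keeps the hierarchy intact while removing the original identifiers.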

13 Data Reader
Free software
PDF philosophy
Access to survey metadata
Access to data (no need for specialized software)
Export to common formats
Single file holds data and metadata

14 What is the CD Builder?
Purpose is to publish survey metadata, documents and data on a CD-ROM (or web site)
Transforms DDI into an HTML-based interface
User can customize the layout (branding) and content of the CD (single or multiple surveys)
Open source application
Built on the Eclipse Framework
Based on DDI / Dublin Core
Integrates with Metadata Editor
Easy to use
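The idea of transforming DDI into HTML can be sketched in a few lines of Python. The sample below uses a heavily simplified, namespace-free fragment with DDI codebook element names (`codeBook`, `stdyDscr`, `dataDscr`, `var`, `labl`); the study title, variables and the function `ddi_to_html` are made up, and the CD Builder itself (an Eclipse-based application) is not implemented this way.

```python
import html
import xml.etree.ElementTree as ET

# Simplified, made-up DDI-style codebook; real DDI files carry far more metadata.
SAMPLE_DDI = """<codeBook>
  <stdyDscr>
    <citation><titlStmt><titl>Household Budget Survey 2004</titl></titlStmt></citation>
  </stdyDscr>
  <dataDscr>
    <var name="hhsize"><labl>Household size</labl></var>
    <var name="region"><labl>Region of residence</labl></var>
  </dataDscr>
</codeBook>"""


def ddi_to_html(ddi_xml):
    """Render the study title and variable list of a DDI-style document as one HTML page."""
    root = ET.fromstring(ddi_xml)
    title = root.findtext("stdyDscr/citation/titlStmt/titl", default="Untitled study")

    rows = []
    for var in root.findall("dataDscr/var"):
        name = html.escape(var.get("name", ""))
        label = html.escape(var.findtext("labl", default=""))
        rows.append(f"<tr><td>{name}</td><td>{label}</td></tr>")

    safe_title = html.escape(title)
    return (
        f"<html><head><title>{safe_title}</title></head><body>"
        f"<h1>{safe_title}</h1>"
        "<table><tr><th>Variable</th><th>Label</th></tr>"
        + "".join(rows)
        + "</table></body></html>"
    )


page = ddi_to_html(SAMPLE_DDI)
print(page)
```

Because the page is generated from the metadata file alone, the same DDI document can feed both a CD-ROM and a web site, which is the reuse the slide describes.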

15 CD Builder Process
1. Create a new CD-ROM project
2. Add a survey to the project and select its type and branding
  – Selecting a survey consists of opening the DDI-XML or Nesstar file
  – The survey “branding” determines the overall look and feel of the CD
  – The survey “type” determines the default metadata content
3. Click the “Save” button to generate the HTML interface
4. After a few minutes, your CD project is ready for publishing!

16 Key Features
Content of CD pages is fully customizable
A CD-ROM project can hold several surveys
Branding customization
Can be published to web
Multilingual support
Automatic updates
…and more…

17 Sample output

18 Handbook
Handbook on the Documentation, Dissemination, and Preservation of Microdata
  – Part I: Policy, legal and ethical issues and recommendations. Benefits and costs of microdata dissemination
  – Part II: Technical guidelines: documenting, disseminating and preserving a dataset
  – Part III: Setting up a central data archive

19 Benefits and Users (1)
What will the toolkit improve?
  – Documentation (based on standards, guidelines and validation)
  – Preservation: data and metadata stay together, CD archiving
  – Cataloguing: facilitate metadata exchange
  – Dissemination: CD, DVD, Web
  – Quality: validation procedures, use of common language, adoption of best practices

20 Benefits and Users (2)
Potential users?
  – Survey producers at national level: preservation, dissemination, harmonize framework
  – International survey sponsors
  – Data archives
Who will benefit?
  – Data producers
  – National & International survey sponsors
  – Survey data repositories
  – Data analysts
  – Policy makers and population
  – DDI Community

21 Status & Availability
Publisher 3.5
  – Beta version available
  – Nesstar commercial release during the summer
CD Builder
  – Beta version available
  – Public release expected in September (Open Source)
Guidelines
  – Draft completed
  – Review over the summer

22 Next?
Distribution, training and adoption of the toolkit
User acceptance tests and pilot sites
Release of open source components (Sourceforge, DDI)
Future developments:
  – Translations into other languages
  – Plug-ins for Publisher and/or Reader (open source)
  – Availability of API library
  – Basic analytical functionalities (tabulation, graphs, etc.)
  – Evaluation of disclosure risks / anonymization procedures
  – Embed document in archive file (?)
  – Plan for DDI 3.0 support
  – Bug fixes / enhancements / new features (based on user feedback)
  – And more based on feedback from users, DDI & open source community
Integration of other tools:
  – Argus [confidentiality]
  – CSPro [production]
  – Virtual Data Center (VDC) [web based dissemination]
Strong collaboration and participation of the community

23 Thank you!
Mark Diggory, Harvard / MIT Data Center, mdiggory@latte.harvard.edu
Olivier Dupriez, World Bank, odupriez@worldbank.org
Pascal Heus, World Bank, pascal.heus@gmail.com
Jostein Ryssevik, Nesstar Ltd., Jostein.Ryssevik@nsd.uib.no
QUESTION / ANSWER

