Steps towards a Single Point of Access for Survey Questions across Europe: The Euro Question Bank Project Wolfgang Zenk-Möltgen Azadeh MahmoudHashemi GESIS.

Presentation transcript:

Steps towards a Single Point of Access for Survey Questions across Europe: The Euro Question Bank Project
Wolfgang Zenk-Möltgen, Azadeh MahmoudHashemi
GESIS, FSD, FORS, UKDA, DDA, NSD, DANS, TARKI, SND
Consortium of European Social Science Data Archives
License: CC BY 4.0 (for exceptions, see the last slide)

Agenda
- Introduction
- Purposes
- Features
- Use cases
- Architecture

Introduction of Euro Question Bank
- EQB content is provided by CESSDA Service Providers
- Contains survey questions from different datasets in different languages
- Contains associated information about studies, datasets, variables, etc.
- Expands on the existing QDB for social science survey research
- Offers databases of well-documented surveys and variables

Purposes of Euro Question Bank
- Develop and implement a central search facility across all CESSDA surveys
- Cover the questions of these surveys as completely as possible
- Explore findings on particular topics to identify existing survey items
- Retrieval will be used for looking up question text, building new questionnaires or comparing questions

Question Data Bank Partners
TARKI, UKDA, NSD, DDA
New partners: GESIS, SND, FORS, DANS, FSD

Key Features of CESSDA EQB
- Based on the DDI-Lifecycle metadata standard
- Provides a conversion tool from the DDI-Codebook metadata standard
- Search and filter by keywords, survey or series title, data collection dates or countries, question types or languages, provider, data availability, etc.

Key Features of CESSDA EQB
- Assists searching in different languages by integrating a multilingual thesaurus (ELSST)
- Discovered questions include study-level information, citation, frequencies, multilingual documentation and links to the full original questionnaires
- Access to CESSDA resources without switching systems

Basic Functions
- A highly relevant tool for any kind of harmonization work
- An important module in the CESSDA Data portal
- Presents different versions of a question for immediate comparison
- EQB will support the development of new questionnaires by giving researchers access to a selection of questions

Use Cases
- Import questions, variables and study from a DDI-XML file. Actor: Harvester. Import through OSMH.
- Update questions, variables and study by importing from a DDI-XML file into the DB, keeping the different versions. Actor: Provider. Automatic update through the harvester.
- The approach to building the use cases was to define the minimum viable product. This was a selection from the broader list of functionalities.
- GESIS provides a possibility to import a specific type of DDI 3.2.
- No additional type of "older" metadata is kept in EQB.
- Versioning: end users are only interested in fielded versions of questions. Questions in longitudinal studies that have been modified over the waves can be seen.
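The import and update use cases above can be sketched as follows. This is a minimal illustration, assuming a heavily simplified, hypothetical XML layout (it is not the DDI 3.2 schema) and a hypothetical `QuestionStore` class; it only shows the idea of keeping every imported version of a question rather than overwriting it.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, simplified stand-ins for DDI-XML files (NOT the real DDI 3.2 schema).
SAMPLE_WAVE_1 = """
<study id="S1" title="Example Panel">
  <question id="Q1" version="1.0.0" lang="en">How satisfied are you with your life?</question>
</study>
"""
SAMPLE_WAVE_2 = """
<study id="S1" title="Example Panel">
  <question id="Q1" version="2.0.0" lang="en">All things considered, how satisfied are you with your life?</question>
</study>
"""


def version_key(version):
    """Turn '2.0.0' into (2, 0, 0) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


class QuestionStore:
    """Hypothetical store that keeps all versions of each question."""

    def __init__(self):
        self._versions = defaultdict(dict)  # question id -> {version: record}

    def import_ddi(self, xml_text):
        """Import questions from one file; return how many new versions were added."""
        root = ET.fromstring(xml_text)
        added = 0
        for q in root.iter("question"):
            record = {
                "id": q.get("id"),
                "version": q.get("version"),
                "language": q.get("lang"),
                "text": (q.text or "").strip(),
                "study": root.get("title"),
            }
            known = self._versions[record["id"]]
            if record["version"] not in known:  # re-imports do not duplicate
                known[record["version"]] = record
                added += 1
        return added

    def versions(self, question_id):
        """Return all stored versions of a question, oldest first."""
        known = self._versions.get(question_id, {})
        return [known[v] for v in sorted(known, key=version_key)]


store = QuestionStore()
store.import_ddi(SAMPLE_WAVE_1)
store.import_ddi(SAMPLE_WAVE_2)
for record in store.versions("Q1"):
    print(record["version"], record["text"])
```

Keying records by question id and version means a repeated harvest of the same file changes nothing, while a modified question from a later wave is stored alongside the earlier one.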

Use Cases
- Find questions, variables, study and concepts in EQB. Actor: End user.
- Filter and search facets on questions, variables, study and concepts in the DB. Actor: End user.
3. EQB focuses on questions, not on variables.
   - OSMH will have different repository handlers for DDI-C and DDI-L metadata. Archives therefore have flexibility in defining what constitutes a question.
   - Some archives might represent sub-questions of complex questions as individual questions.
   - Concept search should provide predictive term suggestions based on keywords actually used in the metadata.
4. The primary object of interest is the question. Study and variables can be treated as attributes and used for filtering and facet searches.
   - Technically straightforward.
   - EQB offers both simple and advanced search.
   - Attributes to use for facet searches (FA) or as filters (FI):
     - Study/group/series title (FA)
     - Question texts (targets "question text", not pre-question or post-question text) (FA)
     - Response categories (FA)
     - Question language (Language of Research Instrument or File language??) (FI)
     - Fielded vs. translated questions (FI)
     - Country of collection (Nation) (FA or FI?)
     - Concept (optional) (FA)
     - Mode of collection (relevant classes only) (FI)
     - Time method (relevant classes only) (FI)
     - Time frame (collDate at study level and/or Time period covered) (FA or FI?)
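The distinction between filters (FI) and facets (FA), and the keyword-based term suggestions, can be illustrated with a small in-memory sketch. The records, field names and functions below are invented for illustration; in EQB this work would be delegated to the search index rather than done in application code.

```python
from collections import Counter

# Invented example records; field names are assumptions, not the EQB data model.
QUESTIONS = [
    {"text": "How interested are you in politics?", "study": "Study A",
     "language": "en", "country": "GB", "fielded": True,
     "keywords": ["politics", "interest"]},
    {"text": "Wie stark interessieren Sie sich für Politik?", "study": "Study B",
     "language": "de", "country": "DE", "fielded": True,
     "keywords": ["politics", "interest"]},
    {"text": "How much do you trust the parliament?", "study": "Study A",
     "language": "en", "country": "GB", "fielded": True,
     "keywords": ["trust", "parliament", "politics"]},
    {"text": "How interested are you in politics?", "study": "Study B",
     "language": "en", "country": "DE", "fielded": False,
     "keywords": ["politics", "interest"]},
]


def apply_filters(records, **filters):
    """Filters (FI) narrow the result set to exact attribute matches."""
    return [r for r in records
            if all(r.get(name) == value for name, value in filters.items())]


def facet_counts(records, attribute):
    """Facets (FA) report how many hits fall under each attribute value."""
    return Counter(r[attribute] for r in records)


def search(records, term, facets=("study", "country"), **filters):
    """Search question text only, then compute facet counts over the hits."""
    hits = [r for r in apply_filters(records, **filters)
            if term.lower() in r["text"].lower()]
    return hits, {name: facet_counts(hits, name) for name in facets}


def suggest_terms(records, prefix, limit=5):
    """Suggest keywords that actually occur in the metadata, most frequent first."""
    counts = Counter(k for r in records for k in r["keywords"])
    matches = [k for k in counts if k.startswith(prefix.lower())]
    return sorted(matches, key=lambda k: (-counts[k], k))[:limit]


hits, facets = search(QUESTIONS, "politics", language="en")
print(len(hits), dict(facets["study"]))
print(suggest_terms(QUESTIONS, "p"))
```

Because the suggestions are drawn from keywords present in the records, a user is never offered a term that would return no results.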

Use Cases
- Compare documentation of questions, variables, study and concepts. Actor: End user.
- Explore relations between questions and question groups in a study. Actor: End user.
5. Select questions and see them side by side, e.g. ICPSR & UKDA.
   - EQB will not provide any analysis of differences.
   - On the comparison page, at least the study title should function as a direct link to the detailed study description.
6. Add questions of interest to a basket in order to export them; the session must not end automatically.
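A minimal sketch of the side-by-side view and the export basket described above, assuming invented record fields and hypothetical names (`side_by_side`, `Basket`). In line with the slide, it only arranges the selected questions next to each other and performs no analysis of differences.

```python
def side_by_side(questions, fields=("study", "language", "text")):
    """Return one row per field, with one column per selected question."""
    return [[field] + [str(q.get(field, "")) for q in questions]
            for field in fields]


class Basket:
    """Hypothetical basket that collects selected questions for export."""

    def __init__(self):
        self._items = []

    def add(self, question):
        if question not in self._items:  # ignore repeated selections
            self._items.append(question)

    def export(self):
        """Export the basket as tab-separated lines: study, then question text."""
        return "\n".join(f'{q["study"]}\t{q["text"]}' for q in self._items)


# Invented example records.
q1 = {"study": "Study A", "language": "en", "text": "How interested are you in politics?"}
q2 = {"study": "Study B", "language": "en", "text": "How interested would you say you are in politics?"}

for row in side_by_side([q1, q2]):
    print(" | ".join(row))

basket = Basket()
basket.add(q1)
basket.add(q2)
print(basket.export())
```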

Use Cases
- Output for a new module, questionnaire or questions. Actor: End user.
- See how special metadata has developed and been repeated in a survey. Actor: End user.
7. Display different versions (as defined by the DDI 3.2 version of a question).

Use Cases
- Translation for a new survey or for a question related to a specific concept, as well as switching the language to view other translations of the same question. Actor: End user.
- Display related studies, datasets, variable names and labels that are used for specific questions.
9. To be able to see different language versions of fielded questions, e.g. for translating a new questionnaire, writing articles, designing a new module, or comparing translations or language-country pair variants.
10. Archives have information about which study/dataset a given question is connected to, and about the variables linked to a question.

Use Cases
- Usage statistics. Actor: Provider.
- Sort result list by preference. Actor: End user.
11. E.g. how often each filter has been used.
12. Order the result list by preferences.
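Both use cases above are small enough to sketch directly. The class and function names below are hypothetical, and the example records are invented; the sketch counts how often each filter is used and sorts a result list by a user-chosen attribute.

```python
from collections import Counter


class UsageStats:
    """Hypothetical counter of how often each filter is used in searches."""

    def __init__(self):
        self.filter_uses = Counter()

    def record(self, **filters):
        # Count one use per filter name present in this search request.
        self.filter_uses.update(filters.keys())

    def report(self):
        """Return (filter name, count) pairs, most used first."""
        return self.filter_uses.most_common()


def sort_results(results, preference, descending=False):
    """Order the result list by the attribute the user prefers."""
    return sorted(results, key=lambda r: r[preference], reverse=descending)


stats = UsageStats()
stats.record(language="en", country="DE")
stats.record(language="de")
print(stats.report())

# Invented example results.
results = [
    {"study": "Study B", "year": 2013},
    {"study": "Study A", "year": 2017},
]
print(sort_results(results, "study"))
print(sort_results(results, "year", descending=True))
```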

EQB Architecture
- EQB - Frontend: Vaadin UI
- EQB - Backend: Elasticsearch, DDI-FlatDB
- Open Source Metadata Harvester (OSMH)

EQB Architecture

EQB - Frontend
- User interface for faceted search
  - Similar to the GESIS GLES Question Search (for an example, see gles.gesis.org)
  - Model and backend not recommended for reuse
  - Beginning of DDI-L model development with domain classes
  - Possible reuse: our UI model for faceted query building, or alternative framework support for querying the search index
- Web services: interaction between different web services
- Vaadin as UI technology
  + UX for the user similar to a desktop application
  + Component-oriented UI development possible
  + More stable code (developers avoid writing JavaScript)
  - Stateful: the user's object tree is stored in their session

EQB - Frontend
Current state: Vaadin UI with simple search on study title implemented (accessing the Elasticsearch index)

EQB - Backend
- Elasticsearch
  - Indexing with an open-source full-text search library
  - Search engine with a JSON-over-HTTP web interface
  - Fast search responses using an inverted index
- DDI-FlatDB
  - The idea is to access the entities very fast, without any problems caused by the DDI version or the MySQL DB.
- Elasticsearch (ES) achieves fast search responses because it searches an index instead of searching the text directly. This is like finding the pages of a book related to a keyword by scanning the index at the back of the book, as opposed to reading every word.
- An index consists of one or more Documents, and a Document consists of one or more Fields.
- In relational database terms, a Document corresponds to a table row and a Field corresponds to a table column.
- In ES an index may store documents of different "mapping types". A mapping type is a way of separating the documents in an index into logical groups.
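The inverted-index idea described above can be shown with a toy example. This is only an illustration of the principle, not how Elasticsearch is implemented; the class name, field names and documents are invented. Each document is a set of fields, and the index maps every token to the ids of the documents that contain it, so a query looks up tokens instead of scanning text.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class InvertedIndex:
    """Toy inverted index: token -> set of ids of documents containing it."""

    def __init__(self):
        self.documents = {}               # doc id -> {field name: value}
        self.postings = defaultdict(set)  # token -> {doc ids}

    def add(self, doc_id, fields):
        """Index a document, which consists of one or more fields."""
        self.documents[doc_id] = fields
        for value in fields.values():
            for token in tokenize(value):
                self.postings[token].add(doc_id)

    def search(self, query):
        """Return ids of documents that contain every token of the query."""
        tokens = tokenize(query)
        if not tokens:
            return []
        matches = set.intersection(
            *(self.postings.get(token, set()) for token in tokens)
        )
        return sorted(matches)


index = InvertedIndex()
index.add(1, {"question_text": "How satisfied are you with your life?",
              "study_title": "Example Panel"})
index.add(2, {"question_text": "How satisfied are you with democracy?",
              "study_title": "Example Election Study"})

print(index.search("satisfied"))
print(index.search("satisfied life"))
print(index.search("income"))
```

A lookup touches only the posting sets of the query tokens, which is why response time depends on the query rather than on the total amount of indexed text.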

EQB - Backend
- DDI-FlatDB
  - The DDI-FlatDB architecture is abstract, efficient and function-driven, and provides RESTful access to studies in DDI format.
  - Stores question, variable and study entities
  - Faster and easier access and loading
  - DDI-FlatDB is flexible in its functionalities, models and configurations.
  - The current DDI files are heterogeneous and vary across versions.
- Current state: Elasticsearch index implemented (accessing the Harvester), FlatDB implemented

Open Source Metadata Harvester (OSMH)
- Harvests all information from heterogeneous and autonomous handlers (SPs) that use different technologies
- A CESSDA MH classifies the entities and objects that get harvested
- Repository Handlers (RHs)
  - Enable repository owners to write RHs for the repository technology they use
  - Current state: Repository Handlers for NESSTAR servers implemented
- Partners within the project extend the existing OSMH with additional metadata fields to support our use cases
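The repository-handler pattern above can be sketched as an abstract interface with one implementation per repository technology. All names here (`RepositoryHandler`, `InMemoryHandler`, `harvest`, `classify`) are hypothetical and do not reflect the actual OSMH API; the in-memory handler merely stands in for a technology-specific one such as a handler for NESSTAR servers.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class RepositoryHandler(ABC):
    """Hypothetical interface that a repository owner implements."""

    provider = "unknown"

    @abstractmethod
    def list_record_ids(self):
        """Return the ids of all records the repository exposes."""

    @abstractmethod
    def get_record(self, record_id):
        """Return one record as a dict of metadata fields."""


class InMemoryHandler(RepositoryHandler):
    """Stand-in for a technology-specific handler; serves records from a dict."""

    def __init__(self, provider, records):
        self.provider = provider
        self._records = records

    def list_record_ids(self):
        return list(self._records)

    def get_record(self, record_id):
        return self._records[record_id]


def classify(record):
    """Classify a harvested record by entity type (assumed 'type' field)."""
    return record.get("type", "unknown")


def harvest(handlers):
    """Collect records from every handler, grouped by entity type."""
    harvested = defaultdict(list)
    for handler in handlers:
        for record_id in handler.list_record_ids():
            record = handler.get_record(record_id)
            harvested[classify(record)].append(
                {**record, "id": record_id, "provider": handler.provider}
            )
    return dict(harvested)


# Invented example providers and records.
handlers = [
    InMemoryHandler("Archive A", {"s1": {"type": "study", "title": "Study A"}}),
    InMemoryHandler("Archive B", {"q1": {"type": "question", "text": "How old are you?"},
                                  "s2": {"type": "study", "title": "Study B"}}),
]
result = harvest(handlers)
print({entity: len(items) for entity, items in result.items()})
```

The harvester depends only on the abstract interface, so supporting a new repository technology means writing one more handler class without changing the harvesting loop.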

Questions?

Thank you for your attention

License
This presentation is offered under the license CC BY 4.0. The license does not apply to the following copyrighted material used in this presentation:
- The logo of GESIS
- The logo of CESSDA
- The slideshow layout of CESSDA