Making the Case for Metadata at SRS-NSF. National Science Foundation, Division of Science Resources Statistics. Jeri Mulrow, Geetha Srinivasarao, and John Gawalt.


1 Making the Case for Metadata at SRS-NSF. National Science Foundation, Division of Science Resources Statistics. Jeri Mulrow, Geetha Srinivasarao, and John Gawalt. FedCASIC Workshops, BLS, March 17, 2010.

2 1984

3 1,984

4 1

5 1 9

6 1 9 8

8 Today’s Talk
- A bit about SRS
- Historical perspective of data and metadata dissemination
- Metadata users and their metadata needs
- Standardization efforts
- Challenges and future vision

9 A bit about the Division of Science Resources Statistics (SRS)
- Federal statistical agency within NSF
- 11 periodic data collections on the U.S. science and engineering enterprise
- Data dating back to the 1950s

10 Historical Perspective of SRS data and metadata dissemination
- 1950s – early 1990s: paper only
- Detailed statistical tables with minimum metadata as footnotes
- Publications included:
  - Highlights about the survey
  - Scope and method of survey
  - Questionnaire
  - Cover letters

11 Example s publication

12 1990s through 2000s
- 1992 – electronic format
- Detailed statistical tables in spreadsheets with minimum metadata as footnotes
- Kept paper, added electronic text: Survey Methodology, Limitations to the data, Definitions, Historical revisions, List of tables
- PDF added: Questionnaire, Cover letters, Instructions

13 Example PDF

14 Example – 1991 Electronic spreadsheet

15 Example – 1991 text

16 Today
- Source data tables in Excel with footnotes
- HTML / PDF:
  - Highlights of the survey
  - Links to references
  - Survey description
- PDF:
  - Survey Questionnaire
  - Instructions
  - Definitions

17 Example – 2007 Excel spreadsheet

18 Example SIRD1

19 Example – 2007 HTML

20 Example – 2007 PDF

21 BUT THAT’S NOT ALL
- Electronic databases
  - Create and download your own customized aggregate tables
- Public use files
  - Access to some microdata series

23 Metadata in WebCASPAR …

24 Metadata in WebCASPAR
- Variable-specific metadata available under the Info link
- Metadata is not tightly integrated with the data itself – it does not get downloaded with the data

25 WebCASPAR Taxonomy
- Survey-specific taxonomies
- NCES IPEDS Classification of Instructional Programs (CIP) codes
- Integrated taxonomy for querying across surveys

28 Metadata in SESTAT (https://sestat.nsf.gov/sestat/sestat.html)
- Metadata Explorer is separate from the data
  - Individual variable information
  - Description
  - Question
  - Domain/Availability – history
  - Valid response categories
  - Keywords
- Metadata is not tightly integrated with the data itself – it does not get downloaded with the data
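
One way to picture "metadata that gets downloaded with the data" is a data file that always ships with a machine-readable sidecar. The sketch below is illustrative only and is not how WebCASPAR or SESTAT work: the `VariableMetadata` fields follow the Metadata Explorer list on the slide, while the function name, the variable `EMPLOYED`, and its records are hypothetical.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field


@dataclass
class VariableMetadata:
    """Variable-level fields mirroring the Metadata Explorer list on the slide."""
    name: str
    description: str
    question: str
    availability: list       # survey cycles in which the variable exists
    valid_responses: dict    # response code -> label
    keywords: list = field(default_factory=list)


def export_with_metadata(rows, variables):
    """Return (csv_text, json_text) so data and metadata travel together."""
    names = [v.name for v in variables]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    sidecar = json.dumps({v.name: asdict(v) for v in variables}, indent=2)
    return buf.getvalue(), sidecar


# Hypothetical variable and records, for illustration only.
employed = VariableMetadata(
    name="EMPLOYED",
    description="Employment status during the reference week",
    question="Were you working for pay or profit?",
    availability=["2003", "2006"],
    valid_responses={"1": "Yes", "2": "No"},
    keywords=["employment"],
)
csv_text, sidecar_text = export_with_metadata(
    [{"EMPLOYED": "1"}, {"EMPLOYED": "2"}], [employed]
)
```

A user who downloads `csv_text` then also receives `sidecar_text`, so the question wording and valid response categories are not lost once the file leaves the website.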

29 Example -- Public Use file

30 Example -- Public Use file

31 Summary – Where are we?
- Different surveys have evolved differently
  - Varying levels of detail/metadata
- Not in a standardized structure
- Hodge-podge

32 Metadata Users & Their Metadata Needs
- Not a one-to-one relationship, but many-to-many
- They occur at all stages of the survey process
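
The many-to-many point can be made concrete with a small lookup. This is a minimal sketch: the stage and user names are transcribed from the per-stage slides in this deck (with "Database Administrators" and "Software Developers" put in the singular), and the variable and function names are assumptions chosen for illustration.

```python
from collections import defaultdict

CORE = ["Data User", "Survey Manager", "Subject Matter Expert", "Statistician"]

# Stage -> users, as listed on the per-stage slides of this deck.
STAGE_USERS = {
    "Define Scope": CORE + ["Survey Methodologist", "Respondent"],
    "Develop Survey Instrument": CORE + ["Survey Methodologist", "Respondent"],
    "Develop Sample Design": list(CORE),
    "Collect Data": CORE + ["Database Administrator", "Software Developer"],
    "Process Data": CORE + ["Database Administrator", "Software Developer"],
    "Disseminate Data": CORE
    + ["Database Administrator", "Software Developer", "Archivist"],
}


def stages_by_user(stage_users):
    """Invert stage -> users into user -> stages."""
    index = defaultdict(list)
    for stage, users in stage_users.items():
        for user in users:
            index[user].append(stage)
    return dict(index)


USER_STAGES = stages_by_user(STAGE_USERS)
```

Each stage serves several users and most users appear at several stages, which is why a single per-user or per-stage document is unlikely to cover everyone's needs.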

33 Survey Process
Source: Survey Methodology (2009), Groves, Fowler, Couper, Lepkowski, Singer & Tourangeau.
Steps:
- Define research objectives
- Choose mode of collection
- Choose sampling frame
- Construct and pretest questionnaire
- Design and select sample
- Recruit and measure sample
- Code and edit data
- Make postsurvey adjustments
- Perform analysis
Stages:
- Define Scope
- Develop Survey Instrument
- Develop Sample Design
- Collect Data
- Process Data
- Disseminate Data

34 Define Scope
Users:
- Data User
- Survey Manager
- Subject Matter Expert
- Statistician
- Survey Methodologist
- Respondent
Metadata:
- General
  - Topic
  - Population of interest
  - Other data sources
- Specific
  - Frame options
  - Sample design options
  - Historical info/data
  - User needs
  - Federal Register notices

35 Develop Survey Instrument
Users:
- Data User
- Survey Manager
- Subject Matter Expert
- Statistician
- Survey Methodologist
- Respondent
Metadata:
- Questions
- Answer choices
- Definition of terms
- Instructions
- Logic flow of questions
- Cognitive work
- Validity assessments
- Reliability assessments
- Functionality testing
- Alternative questions
- Instrument design specs – paper, web, CATI

36 Develop Sample Design
Users:
- Data User
- Survey Manager
- Subject Matter Expert
- Statistician
Metadata:
- Population of interest
- Sampling frame / Universe specs
- Update schedule
- Sample design specs
- Desired criteria
- Sample selection techniques
- Historical information on performance of designs
- Estimation methods

37 Collect Data
Users:
- Data User
- Survey Manager
- Subject Matter Expert
- Statistician
- Database Administrators
- Software Developers
Metadata:
- Variable names and formats
- Variable data types
- Physical storage
- Tables and relationships
- Mapping of questions to variables and definitions
- Logic flow of questions
- Response rates over time
- Paradata
- Cover letter

38 Process Data
Users:
- Data User
- Survey Manager
- Subject Matter Expert
- Statistician
- Database Administrators
- Software Developers
Metadata:
- Item response rates
- Zero vs. null vs. missing
- Edit specifications
- Imputation specifications
- Recode specifications
- Data table specifications
- Changes across survey cycles
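
The "zero vs. null vs. missing" item is easy to lose when data are stored as bare numbers. The sketch below shows one possible way to keep the three cases apart; the slide does not define the terms, so the reading used here (null means the item did not apply, missing means it applied but went unanswered) is an assumption, as are all names and values.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Status(Enum):
    REPORTED = "reported"              # a value was given, possibly a true zero
    NOT_APPLICABLE = "not_applicable"  # assumed meaning of "null"
    MISSING = "missing"                # item applied but was not answered


class Cell(NamedTuple):
    value: Optional[float]
    status: Status


def summarize(cells):
    """Total the reported values and count the cells in each status."""
    counts = {status: 0 for status in Status}
    total = 0.0
    for cell in cells:
        counts[cell.status] += 1
        if cell.status is Status.REPORTED:
            total += cell.value
    return total, counts


# Hypothetical responses to one survey item.
cells = [
    Cell(0.0, Status.REPORTED),
    Cell(12.5, Status.REPORTED),
    Cell(None, Status.NOT_APPLICABLE),
    Cell(None, Status.MISSING),
]
total, counts = summarize(cells)
```

Keeping the status next to the value lets item response rates be computed from the same records: here one of the three applicable cells is missing, while the reported zero still counts as a response.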

39 Data Dissemination and Publication
Users:
- Data User
- Survey Manager
- Subject Matter Expert
- Statistician
- Database Administrators
- Software Developers
- Archivist
Metadata:
- History of changes
- Methodology report
- Public use files with documentation
- Author/contact source
- Who can access what
- Type of product
- Content format
- URL; Keywords
- Relationships
- Metadata schema

40 Who are the Metadata Users?
- Data users
  - Basic & advanced analysts
  - General public
- Respondent
- Survey Manager
- Survey Methodologist
- Statistician
- Subject Matter Expert
- Software Developer
- Database Administrator
- Archivist

41 Need for Standardization of Metadata is Apparent; it is Critical

42 Standardization Efforts
- Dublin Core
- SDMX (aggregate level)
- DDI 3.0 (record level)
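
As a small illustration of the simplest of these standards, the sketch below builds a Dublin Core description of this presentation with Python's standard library. The `dc` namespace URI is the Dublin Core element set namespace; the `metadata` wrapper element and the function name are assumptions made for the example, since Dublin Core itself does not prescribe a container. The field values come from the title slide.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build an XML record with one dc:* element per value."""
    root = ET.Element("metadata")  # wrapper name is an assumption
    for element, values in fields.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return root


record = dublin_core_record({
    "title": ["Making the Case for Metadata at SRS-NSF"],
    "creator": ["Jeri Mulrow", "Geetha Srinivasarao", "John Gawalt"],
    "date": ["2010-03-17"],
    "type": ["Presentation"],
})
xml_text = ET.tostring(record, encoding="unicode")
```

Dublin Core describes a whole product in this way; describing individual aggregates or survey variables is where SDMX and DDI come in, and those need considerably more structure than this sketch shows.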

43 Recent SRS Efforts
- Data Repository (Oracle)
- Inclusion of some metadata
- SAS/ACCESS User Interface for internal users
- Evaluating external user interfaces

44 SRS Efforts -- Working with Commercial Contractors
- Requirements for Data / Metadata delivery
- Examples document
- Standard contracting language
- Checklist

45 SRS Adopted Basic Operating Procedures
- Using Oracle to store microdata and metadata
- Collecting metadata in whatever format
- Keeping it all organized

46 Challenges
- Getting all the players on the same page
  - Many different users
  - Many different uses
  - Many different providers
  - Many different products
  - Many different formats
- Cost
- Keeping it all straight

47 Near Future Vision
- SRS Data Repository
- Data and Metadata
- Taxonomy Efforts
- Data & Metadata Dissemination
- Analytic tools
- DDI 3.0, SDMX …

48 Near Future Vision
- SRS Data Repository
- Data and Metadata
- Taxonomy Efforts
- Data & Metadata Dissemination
- Analytic tools
- DDI 3.0, SDMX …
- Paradata

49 1984

50 Thank you!

