DLI Orientation: Concepts A Framework for Thinking about Statistical Information Train the Trainers Montreal, March 9, 2004 Chuck Humphrey Data Library.

Presentation transcript:

DLI Orientation: Concepts. A Framework for Thinking about Statistical Information. Train the Trainers, Montreal, March 9, 2004. Chuck Humphrey, Data Library, University of Alberta, February 2004. Wendy Watkins, Carleton University.

Statistical Information: Two models for identifying and selecting appropriate statistical information. 1. A chart of statistical information: distinguishing statistics and data; distinguishing aggregate data and microdata.

Statistical Information: 2. Continuum of access: matching dissemination channels with desired products.

Statistics or Data. Statistics: numeric facts and figures; created from data, i.e., already processed; presentation-ready. Data: numeric files created and organized for analysis; requires processing; not ready for display.

Statistics or Data

Chart of Statistical Information

This is a typology of the categories or classes of statistical information. Remember, however, that the relationship between statistics and data is causal: statistics are created from data.
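The point that statistics are created from data can be illustrated with a minimal Python sketch. The records, variable names, and values below are invented for illustration and are not drawn from any Statistics Canada file.

```python
from statistics import mean

# Hypothetical data file: one record per respondent (illustrative values only).
records = [
    {"id": 1, "province": "AB", "age": 34},
    {"id": 2, "province": "AB", "age": 52},
    {"id": 3, "province": "QC", "age": 41},
    {"id": 4, "province": "QC", "age": 29},
]


def average_age(rows):
    """Process the data into a single presentation-ready statistic."""
    return mean(row["age"] for row in rows)


overall_average_age = average_age(records)
print(f"Average age: {overall_average_age}")
```

The list of records plays the role of data (it requires processing), while the printed average plays the role of a statistic (already processed and ready for display).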

Chart of Statistical Information

An overlap occurs in this chart between Statistics: Databases and Data: Aggregate, which will be discussed below.

Chart of Statistical Information: In print

In Print: Rely on yearbooks, statistical abstracts, catalogues, and indexes to locate statistics in print. Examples of online indexes to print resources: Statistical Universe and Tablebase. Example of an online catalogue to print resources: Statistics Canada's Online Catalogue.

Chart of Statistical Information: Online

Online Statistics. Example of e-publications: Statistics Canada Downloadable Publications (DSP). Example of e-tables: Canadian Statistics (STC Website). Example of statistical databases: CANSIM II (STC Website, E-STAT, CHASS).

E-Publications: Tend to be available in PDF format. The Select Text Tool in Adobe Reader can be used to copy columns to another application.

Statistical Information

E-Tables: Tend to be displayed in HTML. May provide a pull-down list to view other categories in the table. Some e-tables provide an alternate format for the table that can be downloaded (e.g., the Census tables are available in comma-separated ASCII, IVT, and print-friendly formats).

Databases: Often use HTML forms to define the statistics to be retrieved. May offer a variety of output formats for the retrieved statistics (e.g., E-STAT provides IVT format for Beyond 20/20, graphs, charts, maps, and ASCII formats for spreadsheets and databases).

Chart of Statistical Information: Aggregate Data

Aggregate Data: Aggregate data are statistics organized in databases or in data files. The data structure usually consists of tabulations structured by time, geography, or social content.

Aggregate Data: Data structure (time, geography, social content). Example: CANSIM II

Aggregate Data: Time series data have long fueled econometric models. Comma-separated values (CSV) has become an important format for time series data, which are often manipulated, if not analyzed, in a spreadsheet such as Excel.
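As a rough illustration of working with time series in CSV outside a spreadsheet, the sketch below reads a small CSV extract with Python's standard `csv` module and computes year-over-year percentage change. The column names (`year`, `value`) and the figures are assumptions made up for this example; they do not reflect the layout of any actual CANSIM II download.

```python
import csv
import io

# Hypothetical CSV extract; column names and values are illustrative only.
SAMPLE_CSV = """year,value
2000,100.0
2001,102.5
2002,105.0
"""


def read_series(text):
    """Return a list of (year, value) pairs sorted by year."""
    reader = csv.DictReader(io.StringIO(text))
    return sorted((int(row["year"]), float(row["value"])) for row in reader)


def percent_change(series):
    """Year-over-year percentage change for each consecutive pair of years."""
    return {
        year: round((value - prev_value) / prev_value * 100, 2)
        for (_, prev_value), (year, value) in zip(series, series[1:])
    }


series = read_series(SAMPLE_CSV)
changes = percent_change(series)
print(changes)
```

With a real file, `io.StringIO(text)` would be replaced by an open file handle, and the column names would need to match the file's header row.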

Aggregate Data: Data structure (time, geography, social content). Example: Census

Aggregate Data: Increased access to GIS software has created greater demand for Census statistics. Beyond 20/20 has become a popular tool for reshaping census statistics from 1996 and 2001 for use with GIS software. DBF is the most commonly used format to share census statistics with GIS software.

Aggregate Data: A map from E-STAT of Montreal Census Tracts

Aggregate Data: Small area statistics are a special category of aggregate data. These data files consist of statistics for small geographic areas, usually calculated from a population or manufacturing census or from an administrative database with enough cases to create accurate summaries for small areas.

Aggregate Data: Data structure (time, geography, social content). Example: Cause of Death (HID)

Aggregate Data: Also known as cross-classified data, these files tend to consist of tables constructed from social content variables. Examples of cross-classified tables in DLI are found in education and justice.
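A cross-classified table can be sketched in a few lines of Python: count the cases that fall into every combination of two social content variables. The records and the variables `sex` and `education` are hypothetical and serve only to show the shape of such a table.

```python
from collections import Counter

# Hypothetical records (illustrative values only).
records = [
    {"sex": "F", "education": "university"},
    {"sex": "F", "education": "high school"},
    {"sex": "M", "education": "university"},
    {"sex": "F", "education": "university"},
    {"sex": "M", "education": "high school"},
]


def cross_classify(rows, row_var, col_var):
    """Count cases for every combination of two social content variables."""
    return Counter((row[row_var], row[col_var]) for row in rows)


table = cross_classify(records, "sex", "education")
for (sex, education), count in sorted(table.items()):
    print(f"{sex:<3}{education:<12}{count}")
```

Each cell of the resulting table is a statistic, which is why files built this way count as aggregate data.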

Chart of Statistical Information: Microdata

Microdata are raw data organized in a file in which each line represents a specific unit of observation and the information on each line consists of the values of variables.

Confidential Microdata: Master files. These files contain the full detail captured about each case of the unit of observation. This detail is specific enough that the identity of a case can often be easily disclosed. Therefore, these files are treated as confidential.

Confidential Microdata: Share files. These are confidential files in which the respondents have signed a consent form permitting Statistics Canada to allow access to their information for approved research.

Public Use Microdata: These microdata are specially prepared to minimize the possibility of disclosing or identifying any of the cases in a file. The original data from the master file are edited to create a public use microdata file.

Public Use Microdata: Steps in Anonymizing Microdata. Remove all personal identification information (names, addresses, etc.); include only gross levels of geography; collapse detailed information into a smaller number of general categories; cap the upper range of values of variables with rare cases; suppress the values of a variable; or suppress entire cases.
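Several of the steps above can be sketched in Python. This is only an illustration under stated assumptions: the records, the set of identifier fields, the age groups, and the income cap are all invented for the example and do not describe Statistics Canada's actual procedures or thresholds.

```python
# Hypothetical master-file records (illustrative values only).
master = [
    {"name": "A. Smith", "postal_code": "T6G 2J8", "province": "AB",
     "age": 37, "income": 48000},
    {"name": "B. Roy", "postal_code": "H3A 0G4", "province": "QC",
     "age": 71, "income": 950000},
]

IDENTIFIERS = {"name", "postal_code"}  # assumed direct identifiers and fine geography
INCOME_CAP = 150000                    # assumed top-code threshold


def age_group(age):
    """Collapse exact age into a small number of general categories."""
    if age < 25:
        return "under 25"
    if age < 65:
        return "25-64"
    return "65+"


def anonymize(record):
    """Remove identifiers, keep only gross geography, collapse age, cap income."""
    public = {k: v for k, v in record.items() if k not in IDENTIFIERS}
    public["age"] = age_group(public["age"])
    public["income"] = min(public["income"], INCOME_CAP)
    return public


pumf = [anonymize(record) for record in master]
print(pumf)
```

Here only the province survives as geography, exact ages become broad groups, and the rare very high income is capped. Suppressing a variable or an entire case would be further filters of the same kind.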

Public Use Microdata: Statistics Canada PUMFs. Only available for select social surveys that undergo a review by the Data Release Committee, an internal Statistics Canada committee. There are no enterprise public use microdata.

Public Use Microdata: Statistics Canada PUMFs. Almost all are cross-sectional, that is, they represent data collected at one point in time. Longitudinal data are difficult to anonymize while retaining any useful information.

Summary: First Model

This first model provides a way of thinking about the types of statistical information that exist. Is the information Statistics or Data? If Statistics, is the information in print or online? If online, is it in an e-pub, e-table, or database? If Data, is the information aggregate data or microdata?
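The sequence of questions in this summary can be written as a small decision function. The function name, argument names, and returned labels below are hypothetical conveniences for the sketch; only the categories themselves come from the model.

```python
def classify(kind, medium=None, form=None):
    """Walk the first model's questions and return a category label.

    kind:   "statistics" or "data"
    medium: "print" or "online" (statistics only)
    form:   "e-publication", "e-table" or "database" for online statistics;
            "aggregate" or "microdata" for data
    """
    if kind == "statistics":
        if medium == "print":
            return "Statistics: In print"
        if medium == "online" and form in {"e-publication", "e-table", "database"}:
            return f"Statistics: Online ({form})"
    elif kind == "data" and form in {"aggregate", "microdata"}:
        return f"Data: {form.capitalize()}"
    raise ValueError("combination not covered by the first model")


print(classify("statistics", medium="online", form="database"))
print(classify("data", form="microdata"))
```

Any combination outside the chart raises an error, which mirrors the idea that the chart is a closed typology.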

The Second Model: It is one thing to know the variety of statistical information that exists, but access to this information is quite a separate issue. The second model describes the various dissemination channels through which access is provided to statistical information by Statistics Canada.

Continuum of Access: Statistics Canada provides access to its statistical information through a variety of services and initiatives that function as dissemination channels. Think of this variety as constituting a continuum along which levels of access are provided.

Continuum of Access: There are three characteristics that make up this continuum. Cost, which runs from free to expensive; restrictions or conditions, which run from open or no restrictions to very restricted; and type of information, which runs from statistics to data.

[Figure: the continuum of access, running from open, free statistics to restricted, expensive data. Annotations on the figure: CANSIM II and Trade Analyzer; services available for selected titles; remote job submission is the most developed for NPHS; applications can now be submitted through the SSHRC Web site.]

Using the Two Models: Combining these two models should assist you in identifying and selecting appropriate statistical information. The types of statistical information should help you identify an appropriate product, while the continuum of access should help you locate the channel or channels through which the statistical information is disseminated.

Warning: Remember that while Statistics Canada is an important source of statistical information in our country, it is not the only source. Other important sources include other federal and provincial government departments, data libraries and archives, non-governmental and inter-governmental agencies, and commercial vendors.