Quantitative Evidence for Marketing. Data Library, Rutherford North, 1st Floor. Chuck Humphrey, Data Library. October 26, 2009.

Outline
- Distinction between statistics and data
- Statistics are about definitions and classifications
- Sources for population demographics: the Census
- Sources for family expenditures and income: CANSIM and E-STAT
- Sources for consumer behaviour: Tablebase, PMB and GMID

Distinguishing statistics from data

How statistics and data differ
Statistics:
- Numeric facts and figures
- Derived from data, i.e., already processed
- Presentation-ready
- Published
Data:
- Numeric files created and organized for analysis or processing
- Require processing
- Not display-ready
- Disseminated, not published
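The step from data to statistics can be illustrated with a minimal sketch. The records below are made up for illustration (they are not Census data), and the field names `age_group` and `sex` are assumptions. The point is only that the data file holds one record per unit of observation, while the statistic is the processed count derived from those records.

```python
from collections import Counter

# Hypothetical microdata: one record per respondent (made-up values).
records = [
    {"age_group": "15-24", "sex": "F"},
    {"age_group": "15-24", "sex": "M"},
    {"age_group": "25-34", "sex": "F"},
    {"age_group": "25-34", "sex": "F"},
    {"age_group": "35-44", "sex": "M"},
]

def frequency_table(rows, row_key, col_key):
    """Count records for each (row category, column category) pair."""
    return Counter((r[row_key], r[col_key]) for r in rows)

# The statistics: counts derived from the data, ready to display.
table = frequency_table(records, "age_group", "sex")
for (age, sex), n in sorted(table.items()):
    print(f"{age:<6} {sex}  {n}")
```

The `records` list corresponds to what a data library disseminates; the printed table corresponds to what would be published.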

Statistics are about definitions

Statistics are about definitions! Statistics are dependent on definitions. You may think of statistics as numbers, but the numbers represent measurements or observations based on specific definitions. Tables are structured around geography, time and social content based on attributes of the unit of observation. These properties all need definitions.

Statistics are about definitions! Consider the following example from the 2006 Canadian Census on the data behind some statistics about visible minorities. Visible Minority Groups (15), Generation Status (4), Age Groups (9) and Sex (3) for the Population 15 Years and Over of Canada, Provinces, Territories, Census Metropolitan Areas and Census Agglomerations, 2006 Census - 20% Sample Data

Statistics are about definitions! How is visible minority status identified in the Census? Are aboriginals among the visible minority in Canada? What is the definition of visible minority?

Statistics involve classifications The definitions that shape statistics specify the metric of the data they summarize (for example, Canadian dollars) or, if a statistic represents counts or frequencies, the categories used to classify things. In this latter case, classification systems are used to identify categories of membership in a concept’s definition. Some classification systems are based on standards while others are based on convention or practice. For an example of a standard, see the North American Industry Classification System (NAICS).
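As a rough sketch of how a classification system turns records into counts, the example below groups establishments by NAICS sector, using the fact that the first two digits of a NAICS code identify the sector. The lookup covers only a handful of sectors, and the six-digit codes are hypothetical inputs chosen for illustration; they have not been checked against a particular NAICS edition.

```python
from collections import Counter

# Partial lookup of two-digit NAICS sectors (retail trade spans 44 and 45).
SECTORS = {
    "23": "Construction",
    "44": "Retail trade",
    "45": "Retail trade",
    "52": "Finance and insurance",
    "72": "Accommodation and food services",
}

def sector_of(naics_code):
    """Classify a NAICS code by its two-digit sector prefix."""
    return SECTORS.get(naics_code[:2], "Unclassified")

# Hypothetical establishment codes, one per establishment.
codes = ["445110", "448140", "722511", "236110", "452910"]

# The resulting statistic: a count of establishments per category.
counts = Counter(sector_of(code) for code in codes)
for sector, n in sorted(counts.items()):
    print(f"{sector}: {n}")
```

Changing the classification (for example, counting at the three-digit level instead) would change the categories and therefore the statistics, even though the underlying data stay the same.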

Statistics are presentation ready Tables and charts (or graphs) are typically used to display many statistics at once. You will find statistics sprinkled in text as part of a narrative describing some phenomenon; but tables and charts are the primary methods of organizing and presenting statistics.

A quick review To this point, we have established that:
- Statistics are ‘real’ only if they are derived from data;
- Statistics are dependent on the definitions of the concepts they summarize;
- Statistics that represent counts of things in the data employ classification systems, which are based either on standards or convention; and
- Statistics are typically organized for display using tables or charts.

Statistics and data sources
- Population Demographics
- Family Expenditures and Income
- Consumer Behaviour

Population and demographics The Census is one of the most important sources of statistical information about Canada. It is the largest survey conducted in Canada and, consequently, is the primary source for small area statistics. To use data from the Census, you must know:
- The characteristics collected in the Census that are available for the spatial units used to disseminate results;
- The variety of spatial units used to disseminate Census results.

Census of Population Two forms are used to collect the Census: 2A, which goes to 80% of households, and 2B, which goes to the other 20%. In 2006, the 2A form contained 8 questions, while the 2B form had these 8 plus 53 additional questions (61 in total). Specific questions have a long history (see the Census Dictionary). You need to understand the content of the Census to know what statistics are possible from it.

Urban small area statistics: Census Metropolitan Areas
[Figure: map of the Edmonton CMA. Source for the graphic: Illustrated Glossary, 2006 Census Geography, Statistics Canada.]

Census results for 2006
Standard Census data products:
- Highlight tables
- Profiles
- Census trends
- Topic-based tabulations
For smaller areas outside CMAs, or for dissemination areas, you need to retrieve results from the Data Library.
Public use microdata files for individuals, households and families

A database product & a portal Before showing you products for income and family expenditures, you need to know about CANSIM and E-STAT.

CANSIM CANSIM is a very large database containing socio-economic statistics for Canada. There are currently over 38 million time series organized in approximately 2,800 tables. The statistics in CANSIM come from surveys (e.g., the Labour Force Survey), administrative data (e.g., crime and justice) and simulations or models (e.g., population projections). Geography, content and time are basic to retrieving time series from CANSIM.
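The role of geography, content and time in retrieval can be sketched with a small in-memory example. This is not the CANSIM interface: the `observations` list, the `time_series` function and the label "Indicator A" are hypothetical, and every value is a placeholder rather than a real statistic. The sketch only shows that a single time series is pinned down by choosing one geography and one content item and then a span of time.

```python
# Hypothetical extract of a CANSIM-style table: one row per observation.
# All values are placeholders, not real statistics.
observations = [
    {"geography": "Alberta", "content": "Indicator A", "year": 2007, "value": 1.0},
    {"geography": "Alberta", "content": "Indicator A", "year": 2008, "value": 2.0},
    {"geography": "Alberta", "content": "Indicator B", "year": 2008, "value": 3.0},
    {"geography": "Canada", "content": "Indicator A", "year": 2008, "value": 4.0},
]

def time_series(rows, geography, content, start, end):
    """Return sorted (year, value) pairs for one geography and content item,
    restricted to the years from start to end inclusive."""
    picked = [
        (r["year"], r["value"])
        for r in rows
        if r["geography"] == geography
        and r["content"] == content
        and start <= r["year"] <= end
    ]
    return sorted(picked)

alberta_a = time_series(observations, "Alberta", "Indicator A", 2007, 2008)
print(alberta_a)
```

Leaving any one of the three dimensions unspecified would return several series mixed together, which is why all three are needed to retrieve a single series.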

E-STAT E-STAT is a portal to retrieve Census results and free CANSIM holdings. The tables in this version of CANSIM are extracted once a year in July, while the online version of CANSIM on the Statistics Canada website is updated daily. If you access a table using CANSIM on the Statistics Canada website, you must pay $3.00 per time series. The U of A also subscribes to the CHASS version of CANSIM, which is updated weekly. As with E-STAT, there is no charge for using this version.

Family expenditures and income

Census has individual and household income
Income from administrative sources:
- T1 family file
- Longitudinal administrative database
Survey sources for expenditure data:
- Survey of Household Spending
Public use microdata files for Survey of Household Spending

Consumer behaviour
- Tablebase contains statistics from the trade literature.
- GMID: Global Market Information Database
- PMB contains statistics about Canadian consumer demographics for specific product information.
Use keyword searches to find tables of interest and then conduct new searches employing the index terms assigned to them.

