Data Reference (the very, very basics) Data-reference: what do we need? Tools Strategies Terminology Understanding of what we are looking for: not.


Presentation transcript:

1

2 Data Reference (the very, very basics)

3

4 Data-reference: what do we need? Tools; Strategies; Terminology; Understanding of what we are looking for: not books or articles -- or facts.

5 Data-reference: what do we need? Understanding of what we are looking for: not books or articles -- or facts; Terminology; Strategies; Tools

6

7 La trahison des images, The treachery of images, Rene Magritte

8 Ceci n’est pas les “data.” C’est les statistiques! (This is not “data.” It is statistics!)

9 Data vs. Statistics
Data: Raw (for analysis). Intended for use by computer. Collected based on social science methodologies or administrative procedures. Computer-readable.
Statistics: Cooked (facts). For human use: eye-readable, charts, tables, graphs. Produced from data. Can be print, micro, computer-readable.

10 Data

11 Statistics

12 Where do statistical babies come from? + =

13 Data or Statistics: Why does it matter? Different search strategies and tools. Defines your goal. Helps you know when you've found it!

14 Tip: Data or Statistics? Determine if the user wants (needs) statistics or data. – Do you want one number? – Are you looking for a fact or figure? – Do you want to know “how many?”

15 Tip: Data or Statistics? Determine if the user wants (needs) statistics or data. – Or… do you want a series of numbers? – Do you want to identify trends, make comparisons, model relationships? – Will you be using statistical software (not Excel)?

16

17 http://factfinder.census.gov/

18 http://www.census.gov/compendia/statab/elections/election.pdf

19 http://www.census.gov/compendia/statab/tables/06s0405.xls

20 ftp://ftp.bls.gov/pub/special.requests/lf/aat44.txt

21 http://www.bls.gov/webapps/legacy/cpsatab7.htm

22

23

24 From survey to data to statistics…
Survey instrument
Q1. [enter zip code]
Q2. [enter R’s first name]
Q3. [enter sex of R]
Q4. What was your major in College?
Q5. What was your income last year?
Q6. Did you go to church last week?

25 Answers to Questions
Zip    Name     Sex  Major    income  church
29002  Wilma    F    lit      0       y
99005  Barney   M    engin    10      n
99005  Betty    F    .        0       n
92005  Ethel    F    theater  1000    y
12534  Fred M.  M    PE       10000   y
12534  Lucy     F    lit      700     y
25000  Ricky    M    music    11000   y
20000  Fred A.  M    dance    10500   n
15000  Ginger   F    math     9500    y

26 Must anonymize the data! Zip Name Sex Major income church 29002 Wilma F lit 0 y 99005 Barney M engin 10 n 99005 Betty F. 0 n 92005 Ethel F theater 1000 y 12534 Fred M. M PE 10000 y 12534 Lucy F lit 700 y 25000 Ricky M music 11000 y 20000 Fred A. M dance 10500 n 15000 Ginger F math 9500 y

27 Must anonymize the data!
Zip    Name  Sex  Major    income  church
29002  001   F    lit      0       y
99005  002   M    engin    10      n
99005  003   F    .        0       n
92005  004   F    theater  1000    y
12534  005   M    PE       10000   y
12534  006   F    lit      700     y
25000  007   M    music    11000   y
20000  008   M    dance    10500   n
15000  009   F    math     9500    y
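The anonymization step on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `anonymize` and the tuple layout are hypothetical, and the IDs are simply assigned in row order, which is what the slide appears to do.

```python
# Rows from the "Answers to Questions" slide:
# (zip, name, sex, major, income, church); None marks the missing major.
respondents = [
    ("29002", "Wilma", "F", "lit", 0, "y"),
    ("99005", "Barney", "M", "engin", 10, "n"),
    ("99005", "Betty", "F", None, 0, "n"),
    ("92005", "Ethel", "F", "theater", 1000, "y"),
    ("12534", "Fred M.", "M", "PE", 10000, "y"),
    ("12534", "Lucy", "F", "lit", 700, "y"),
    ("25000", "Ricky", "M", "music", 11000, "y"),
    ("20000", "Fred A.", "M", "dance", 10500, "n"),
    ("15000", "Ginger", "F", "math", 9500, "y"),
]


def anonymize(rows):
    """Replace each name with a zero-padded sequential ID (001, 002, ...)."""
    return [
        (row[0], f"{number:03d}") + row[2:]
        for number, row in enumerate(rows, start=1)
    ]


anonymized = anonymize(respondents)
for row in anonymized:
    print(*row)
```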

28 Change Text to Numeric Codes Zip Name Sex Major income church 29002 001 F lit 0 y 99005 002 M engin 10 n 99005 003 F. 0 n 92005 004 F theater 1000 y 12534 005 M PE 10000 y 12534 006 F lit 700 y 25000 007 M music 11000 y 20000 008 M dance 10500 n 15000 009 F math 9500 y

29 Zip Name Sex Major income church 29002 001 1 lit 0 y 99005 002 2 engin 10 n 99005 003 1. 0 n 92005 004 1 theater 1000 y 12534 005 2 PE 10000 y 12534 006 1 lit 700 y 25000 007 2 music 11000 y 20000 008 2 dance 10500 n 15000 009 1 math 9500 y Change Text to Numeric Codes

30 Change Text to Numeric Codes
Zip    Name  Sex  Major    income  church
29002  001   1    lit      0       y
99005  002   2    engin    10      n
99005  003   1    .        0       n
92005  004   1    theater  1000    y
12534  005   2    PE       10000   y
12534  006   1    lit      700     y
25000  007   2    music    11000   y
20000  008   2    dance    10500   n
15000  009   1    math     9500    y
The “codebook” must document the numeric codes used! For example:
Variable: “sex”
1 = female
2 = male

31 Zip Name Sex Major income church 29002 001 1 0075 0 y 99005 002 2 0070 10 n 99005 003 1. 0 n 92005 004 1 0076 1000 y 12534 005 2 0001 10000 y 12534 006 1 0075 700 y 25000 007 2 0077 11000 y 20000 008 2 0078 10500 n 15000 009 1 0050 9500 y Change Text to Numeric Codes

32 Zip Name Sex Major income church 29002 001 1 0075 0 1 99005 002 2 0070 10 2 99005 003 1. 0 2 92005 004 1 0076 1000 1 12534 005 2 0001 10000 1 12534 006 1 0075 700 1 25000 007 2 0077 11000 1 20000 008 2 0078 10500 2 15000 009 1 0050 9500 1 Change Text to Numeric Codes
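A minimal sketch of this recoding step, assuming the code values shown on the slides (sex 1/2, four-digit major codes, church 1/2). The names `CODEBOOK`, `MISSING`, and `encode` are hypothetical, and using "." for a missing value follows the slide's own table rather than any particular software convention.

```python
# Text-to-code mappings read off the slides; a real codebook would list these.
CODEBOOK = {
    "sex": {"F": "1", "M": "2"},
    "major": {
        "lit": "0075", "engin": "0070", "theater": "0076", "PE": "0001",
        "music": "0077", "dance": "0078", "math": "0050",
    },
    "church": {"y": "1", "n": "2"},
}
MISSING = "."  # the slides show a lone "." where the major was not recorded


def encode(record):
    """Replace text answers with numeric codes; leave other fields unchanged."""
    return {
        variable: CODEBOOK[variable].get(value, MISSING)
        if variable in CODEBOOK else value
        for variable, value in record.items()
    }


wilma = {"zip": "29002", "id": "001", "sex": "F",
         "major": "lit", "income": 0, "church": "y"}
betty = {"zip": "99005", "id": "003", "sex": "F",
         "major": None, "income": 0, "church": "n"}

print(encode(wilma))
print(encode(betty))
```

Keeping the mappings in one dictionary mirrors the point of the slides: without the codebook, the numbers in the file cannot be read back as answers.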

33 Zip Name Sex Major income church 29002 001 1 lit 0 y 99005 002 2 engin 10 n 99005 003 1. 0 n 92005 004 1 theater 1000 y 12534 005 2 PE 10000 y 12534 006 1 lit 700 y 25000 007 2 music 11000 y 20000 008 2 dance 10500 n 15000 009 1 math 9500 y Change Text to Numeric Codes

34 Zip Name Sex Major income church 29002 001 1 0075 0 y 99005 002 2 engin 10 n 99005 003 1. 0 n 92005 004 1 theater 1000 y 12534 005 2 PE 10000 y 12534 006 1 0075 700 y 25000 007 2 music 11000 y 20000 008 2 dance 10500 n 15000 009 1 math 9500 y Change Text to Numeric Codes

35 Zip Name Sex Major income church 29002 001 1 0075 0 y 99005 002 2 0070 10 n 99005 003 1. 0 n 92005 004 1 0076 1000 y 12534 005 2 0001 10000 y 12534 006 1 0075 700 y 25000 007 2 0077 11000 y 20000 008 2 0078 10500 n 15000 009 1 0050 9500 y Change Text to Numeric Codes

36 Zip Name Sex Major income church 29002 001 1 0075 0 1 99005 002 2 0070 10 2 99005 003 1. 0 2 92005 004 1 0076 1000 1 12534 005 2 0001 10000 1 12534 006 1 0075 700 1 25000 007 2 0077 11000 1 20000 008 2 0078 10500 2 15000 009 1 0050 9500 1 Sometimes, even numeric variables are encoded in ranges. For example: Variable: “income” 1 = less than 1000 2 = 1000 - 4999 3 = 5000 - 10000 4 = more than 10000 9 = not reported Change Text to Numeric Codes

37 Change Text to Numeric Codes
Zip    Name  Sex  Major  income  church
29002  001   1    0075   1       1
99005  002   2    0070   1       2
99005  003   1    .      1       2
92005  004   1    0076   2       1
12534  005   2    0001   3       1
12534  006   1    0075   1       1
25000  007   2    0077   4       1
20000  008   2    0078   4       2
15000  009   1    0050   3       1
Sometimes, even numeric variables are encoded in ranges. For example:
Variable: “income”
1 = less than 1000
2 = 1000 - 4999
3 = 5000 - 10000
4 = more than 10000
9 = not reported
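The range coding for “income” can be written as a small function. This is a sketch: the name `income_code` is hypothetical, and treating `None` as "not reported" is an assumption about how a missing answer would be represented.

```python
def income_code(income):
    """Map a raw income to the range codes listed on the slide."""
    if income is None:      # assumption: None stands for "not reported"
        return 9
    if income < 1000:
        return 1
    if income < 5000:       # 1000 - 4999
        return 2
    if income <= 10000:     # 5000 - 10000
        return 3
    return 4                # more than 10000


# Raw incomes in the order they appear in the table.
incomes = [0, 10, 0, 1000, 10000, 700, 11000, 10500, 9500]
codes = [income_code(value) for value in incomes]
print(codes)  # [1, 1, 1, 2, 3, 1, 4, 4, 3], the income column on this slide
```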

38 Data Files do not need “headers” Zip Name Sex Major income church 29002 001 1 0075 1 1 99005 002 2 0070 1 2 99005 003 1. 1 2 92005 004 1 0076 2 1 12534 005 2 0001 3 1 12534 006 1 0075 1 1 25000 007 2 0077 4 1 20000 008 2 0078 4 2 15000 009 1 0050 3 1

39 29002 001 1 0075 1 1 99005 002 2 0070 1 2 99005 003 1. 1 2 92005 004 1 0076 2 1 12534 005 2 0001 3 1 12534 006 1 0075 1 1 25000 007 2 0077 4 1 20000 008 2 0078 4 2 15000 009 1 0050 3 1 Data Files do not need “headers”

40 Data Files do not need extra space 29002 001 1 0075 1 1 99005 002 2 0070 1 2 99005 003 1. 1 2 92005 004 1 0076 2 1 12534 005 2 0001 3 1 12534 006 1 0075 1 1 25000 007 2 0077 4 1 20000 008 2 0078 4 2 15000 009 1 0050 3 1

41 290020011 0075 1 1 990050022 0070 1 2 990050031. 1 2 920050041 0076 2 1 125340052 0001 3 1 125340061 0075 1 1 250000072 0077 4 1 200000082 0078 4 2 150000091 0050 3 1 Data Files do not need extra space

42 2900200110075 1 1 9900500220070 1 2 990050031. 1 2 9200500410076 2 1 1253400520001 3 1 1253400610075 1 1 2500000720077 4 1 2000000820078 4 2 1500000910050 3 1 Data Files do not need extra space

43 29002001100751 1 99005002200701 2 990050031. 1 2 92005004100762 1 12534005200013 1 12534006100751 1 25000007200774 1 20000008200784 2 15000009100503 1 Data Files do not need extra space

44 290020011007511 990050022007012 990050031. 12 920050041007621 125340052000131 125340061007511 250000072007741 200000082007842 150000091005031 Data Files do not need extra space

45 Codebook must document locations
290020011007511
990050022007012
990050031. 12
920050041007621
125340052000131
125340061007511
250000072007741
200000082007842
150000091005031
For example:
Variable: “sex”
location: column 9
width: 1

46 290020011007511 990050022007012 990050031. 12 920050041007621 125340052000131 125340061007511 250000072007741 200000082007842 150000091005031 For example: Variable: “sex” location: column 9 width: 1 123456789 Codebook must document locations

47 Codebook documents question, location, codes.
290020011007511
990050022007012
990050031. 12
920050041007621
125340052000131
125340061007511
250000072007741
200000082007842
150000091005031
For example:
Q3. [enter sex of R]
Variable: “sex”
location: column 9
width: 1
Variable: “sex”
1 = female
2 = male
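To show why the locations matter, here is a hedged sketch of reading one packed record back into variables. Only the location of “sex” (column 9, width 1) is stated on the slide; the other entries in `LAYOUT` are assumptions inferred from the field widths in the earlier tables, and the names `LAYOUT`, `SEX_LABELS`, and `parse` are hypothetical.

```python
# variable: (start column, width); columns are counted from 1, as in a codebook.
LAYOUT = {
    "zip": (1, 5),      # assumed
    "id": (6, 3),       # assumed
    "sex": (9, 1),      # stated on the slide
    "major": (10, 4),   # assumed
    "income": (14, 1),  # assumed
    "church": (15, 1),  # assumed
}
SEX_LABELS = {"1": "female", "2": "male"}


def parse(line):
    """Slice a fixed-width record into named fields using the layout."""
    return {
        name: line[start - 1:start - 1 + width].strip()
        for name, (start, width) in LAYOUT.items()
    }


record = parse("290020011007511")
print(record)
print(SEX_LABELS[record["sex"]])  # female
```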

48 To Use Data You Need 3 Things
Data: the datafile (the raw numbers)
Metadata: the “codebook” (where the numbers are and what they mean)
Statistical Software (for reading the datafile and analyzing the data)

49 Data + Codebook + Statistical software
Data:
290020011007511
990050022007012
990050031. 12
920050041007621
125340052000131
125340061007511
250000072007741
200000082007842
150000091005031
Codebook:
Q3. [enter sex of R]
Variable: “sex”
location: column 9
width: 1
Variable: “sex”
1 = female
2 = male

50 SPSS commands
Student writes SPSS program to analyze data…
SPSS reads the program.
SPSS reads the data:
290020011007511
990050022007012
990050031. 12
920050041007621
125340052000131
125340061007511
250000072007741
200000082007842
150000091005031
And produces charts, tables, analysis, etc.
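The slide uses SPSS; as a stand-in, the same "data in, statistics out" step can be sketched in Python (this is not SPSS syntax). Assumptions: church is in column 15 with 1 = yes and 2 = no, as the recoded tables suggest, and the third record is padded with spaces so that it is 15 characters wide like the others.

```python
from collections import Counter

# The nine packed records; record 3 is padded to full width (an assumption).
records = [
    "290020011007511",
    "990050022007012",
    "990050031.   12",
    "920050041007621",
    "125340052000131",
    "125340061007511",
    "250000072007741",
    "200000082007842",
    "150000091005031",
]
SEX = {"1": "female", "2": "male"}   # column 9, from the codebook slide
CHURCH = {"1": "yes", "2": "no"}     # column 15, assumed from the tables

# Cross-tabulate sex by church attendance: this table is the "statistics".
counts = Counter((SEX[line[8]], CHURCH[line[14]]) for line in records)
for (sex, church), n in sorted(counts.items()):
    print(f"{sex:6} went to church: {church:3} -> {n}")
```

The printed counts are statistics in the sense used earlier in the deck: a human-readable summary produced from the data with the help of the codebook.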

51 Female 49 years old

52 Codebook entry for variable PRES92 Question text Responses

53 Codebook entry for variable DEGREE Question text Responses

54 Voted for Clinton Junior college Female 49 years old

55

56 Degree Pres92

57

58

59 Tip: "variables" contain the essential, important content of data files

60 Tip: Data-reference is not about searching for an answer… Data reference is often less about searching to find an answer. (That's a statistical reference question.) Data reference is often more about exploring to find data that will enable users to ask a question.

61 What have we learned? Data and statistics are not the same. Data reference leads to primary research material, not facts or statistics. To use data, a user must have data, metadata, and statistical software. A-and…

62 What have we learned? "Variables" are what contain the critical, important content of data files. And that means that the gold standard of data reference is variable-level searching.
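Variable-level searching means searching the question text and variable names in codebooks rather than study titles. A minimal sketch, assuming a codebook held as a list of entries built from the sample survey earlier in the deck; the names `CODEBOOK` and `search_variables` are hypothetical, and real tools index many studies at once.

```python
# Codebook entries built from the sample survey instrument (Q3 to Q6).
CODEBOOK = [
    {"variable": "sex", "question": "Q3. [enter sex of R]"},
    {"variable": "major", "question": "Q4. What was your major in College?"},
    {"variable": "income", "question": "Q5. What was your income last year?"},
    {"variable": "church", "question": "Q6. Did you go to church last week?"},
]


def search_variables(codebook, keyword):
    """Return the variables whose name or question text mentions the keyword."""
    needle = keyword.lower()
    return [
        entry["variable"]
        for entry in codebook
        if needle in entry["variable"].lower()
        or needle in entry["question"].lower()
    ]


print(search_variables(CODEBOOK, "church"))  # ['church']
print(search_variables(CODEBOOK, "last"))    # ['income', 'church']
```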

63

64 http://gort.ucsd.edu/calpol/

65 Question Text (Variable 34) Study of July 2003

66

