Published by Christian Larsen. Modified over 2 years ago.

1
Units of Analysis: The Basics
Chuck Humphrey
ACCOLEDS/DLI Training, December 2001

2
Outline
– An illustration
– Definitions
– Elements of the unit of analysis
– Complexity
– Data structure

3
An Illustration A group of students in an econometrics class were sent to the Data Library to find some data for an assignment.

4
An Illustration A typical request was like this one: "I want to look at crime rates and a person's level of education."

5
An Illustration This request raises problems.
– Crime rates are usually associated with spatial units or a time series.
– A person's education is an attribute of individuals.

6
An Illustration What are we looking for? Does the student want crime rates and the percentage of the population with certain education levels for specific cities? This would be data aggregated over geography.

7
An Illustration What are we looking for? Does the student want the crime rate for one city over time, such as the number of homicides in Edmonton over the past 40 years? This would be data aggregated over time.

8
An Illustration What are we looking for? Does the student want the education level of criminals? This would be a special subpopulation, individuals convicted of crimes, and the data would consist of a microdata file of criminals.

9
An Illustration What are we looking for? Does the student want the education level of victims of crimes? This would be a special subpopulation, individuals who were victimized, and the data would consist of a microdata file of victims.

10
An Illustration Looking at crime rates and level of education can differ depending upon the unit of analysis.
– individuals
– geographic areas
– changes over time
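As a rough sketch of the point above, the same person-level records can be summarized for different units of analysis. The records, the `years_of_school` variable, the second city, and the `mean_by` helper are all invented for illustration; they are not taken from any real file.

```python
from collections import defaultdict

# Hypothetical person-level records (invented values, illustration only).
persons = [
    {"city": "Edmonton", "year": 2000, "years_of_school": 12},
    {"city": "Edmonton", "year": 2001, "years_of_school": 16},
    {"city": "Calgary", "year": 2000, "years_of_school": 10},
    {"city": "Calgary", "year": 2001, "years_of_school": 14},
]


def mean_by(records, key, value):
    """Aggregate person-level records to a coarser unit of analysis."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec[value])
    return {k: sum(v) / len(v) for k, v in groups.items()}


# Unit of analysis: geographic area (one figure per city).
by_city = mean_by(persons, "city", "years_of_school")

# Unit of analysis: point in time (one figure per year).
by_year = mean_by(persons, "year", "years_of_school")

print(by_city)
print(by_year)
```

Left unaggregated, `persons` keeps the individual as the unit of analysis; each call to `mean_by` changes what a single row of the result describes.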

11
An Illustration After being walked through these steps, the student chose to build a model predicting income on the basis of highest educational attainment and a few other variables from the Census individual-level public use microdata file. He completely abandoned his interest in crime!

12
An Illustration Unfortunately, the student's initial request not only failed to specify a clear unit of analysis but also included a mix of different units, which suggests that the concept was not understood.

13
The Point of the Illustration The unit of analysis is fundamental to the data reference interview. Early identification of the unit of analysis will help focus a search on statistics, aggregate data, or microdata.

14
The Point of the Illustration Furthermore, the unit of analysis is fundamental to secondary data analysis. It may be that knowledge of the unit of analysis is even more crucial in secondary analysis than in primary analysis, where the unit is implicit in the sample design, if not otherwise explicit.

15
The Point of the Illustration Finally, the unit of analysis is a fundamental characteristic of statistical data structures, which are the formal ways in which data are organized for processing.

16
Definitions The unit of analysis is the basic entity or object about which generalizations are to be made based on an analysis, and for which data have been collected.

17
Definitions How does the unit of analysis relate to the unit of observation? The unit of observation is the entity in primary research that is observed and about which information is systematically collected.

18
Definitions The unit of observation and the unit of analysis are the same when the generalizations being made from a statistical analysis are attributed to the unit of observation.

19
Definitions
– Unit of Observation: in original data collections, the unit of observation is determined by the method by which observations are selected.
– Unit of Analysis: the unit of analysis is determined by an interest in exploring or explaining a specific phenomenon.

20
Identifying a Unit of Analysis As hinted in the earlier illustration, the unit of analysis is shaped by three attributes:
– Social Phenomena
– Time
– Space

21
Research Outputs Let's begin by looking at a finished product to display these attributes. We'll use a table from the Health Indicators Database about suicide.

22
Social Characteristics Geography and Time held constant

23
Ordered by Time Geography and Age held constant

24
Geography Emphasized Time and Age held constant

25
Social Phenomena
– observations of a single social entity, such as a person or an institution
– observations of multiple entities with a defined relationship, such as family or employer-employee

26
Social Phenomena transactional observations that are the result of actions among entities, such as labour strikes or international conflicts, including wars

27
Time observations made at one point in time; commonly referred to as a cross-sectional study

28
Time observations made at multiple points in time
– the data may be organized by time; commonly referred to as a time series
– time may structure some form of repeated measures of content or subjects

29
Space
– observations made within a specific spatial area
– observations made within a hierarchy of spatial areas

30
Complexity Complexity occurs when multiple types of entities are introduced within the same study. Examples:
– parent, child, teacher
– person, activity, time
– person, car, trips

31
Complexity This complexity can arise within one of the attributes just discussed:
– a study of parents, children, and teachers, which are all social units
or between attributes:
– a study of people, their daily activities, and the length of time of each activity

32
Complexity Complexity is often represented in a hierarchy when the units can be grouped or nested within one another. For example, children may be grouped with their parents.

33
Complexity Children grouped (nested) with Parents.
[Diagram: Parent 1 and Parent 2 at the upper level, with Child 1, Child 2, and Child 3 nested beneath them]

34
Complexity Parents and their children may be grouped into families and families grouped into households.
Household 1
  Family A
    Person i
    Person ii
Household 2
  Family A
    Person i
    Person ii
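One way to picture this nesting, purely as a sketch, is a nested Python mapping that mirrors the household, family, and person labels on the slide. The `flatten` helper is a hypothetical name; it shows how each person can carry the identifiers of the units it is nested within.

```python
# Nested representation mirroring the slide: households contain families,
# and families contain persons.
households = {
    "Household 1": {"Family A": ["Person i", "Person ii"]},
    "Household 2": {"Family A": ["Person i", "Person ii"]},
}


def flatten(nested):
    """Return one (household, family, person) row per person."""
    rows = []
    for household, families in nested.items():
        for family, persons in families.items():
            for person in persons:
                rows.append((household, family, person))
    return rows


rows = flatten(households)
for row in rows:
    print(row)
```

Note that "Family A" and "Person i" repeat across households, so a person is identified uniquely only by the full path through the hierarchy.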

35
Complexity Complexity may also be represented by combinations of entities among units. Those entities that are associated with one another are combined, and those that aren't associated aren't combined.

36
Complexity These combinations are often described as having been crossed. For example, activities may be crossed with people.

37
Complexity Activities crossed with people.
[Diagram: Activities 1 to 6 crossed (X) with Person A and Person B, giving:]
– Person A: Activity 3, Activity 6
– Person B: Activity 1, Activity 5
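The crossing on this slide can be sketched as follows, assuming the diagram means that Person A is associated with Activities 3 and 6 and Person B with Activities 1 and 5. The `performed` mapping encodes that reading; only associated pairs survive the crossing.

```python
people = ["Person A", "Person B"]
activities = [f"Activity {i}" for i in range(1, 7)]

# Assumed reading of the slide's diagram: which person did which activity.
performed = {
    "Person A": {"Activity 3", "Activity 6"},
    "Person B": {"Activity 1", "Activity 5"},
}

# Cross people with activities, keeping only the pairs that are associated.
crossed = [
    (person, activity)
    for person in people
    for activity in activities
    if activity in performed[person]
]

for pair in crossed:
    print(pair)
```

Activities 2 and 4 are associated with nobody, so they do not appear in `crossed` at all.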

38
Complexity Up to this point, complexity has been described conceptually. We've mentioned how multiple units of analysis and the ways in which they are related can create complexity.

39
Complexity Complexity also manifests itself structurally through the ways in which data are organized to represent the nesting or crossing of multiple units of analysis.

40
Thinking about Units of Analysis
– Conceptually: What is the content? This is what we've been reviewing up to this point.
– Structurally: How is it organized? This takes us to a discussion about data structure.

41
Statistical Data Structure Let's review basic data structure. The unit of analysis defines the underlying structure of a data file.

42
Statistical Data Structure This structure consists of a series of rows, with each row containing the data of one member of the unit of analysis. This simple structure is known as the flat, rectangular data matrix.

43
Statistical Data Structure [Diagram: a column of rows labelled Case 1, Case 2, Case 3, ..., Case n-1, Case n]

44
Statistical Data Structure All of the information collected for each member of the unit of analysis is organized in fixed locations in the file, called fields or variables.
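A minimal sketch of fields at fixed locations, assuming a fixed-width text file. The layout, the variable names (`caseid`, `age`, `sex`), and the three raw lines are hypothetical and do not describe any real file.

```python
# Hypothetical layout: (field name, start column, end column),
# using 0-based, end-exclusive positions.
LAYOUT = [("caseid", 0, 3), ("age", 3, 5), ("sex", 5, 6)]

# Invented raw records: one line per case, every field in the same columns.
raw_lines = ["001341", "002272", "003451"]


def parse(line):
    """Cut one fixed-width record into named fields."""
    return {name: line[start:end] for name, start, end in LAYOUT}


cases = [parse(line) for line in raw_lines]
for case in cases:
    print(case)
```

Because every case uses the same columns for the same field, the result is the flat, rectangular matrix described earlier: n cases by k fields.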

45
Statistical Data Structure [Diagram: a grid with rows Case 1, Case 2, Case 3, ..., Case n-1, Case n and columns Field 1, Field 2, Field 3, ..., Field k-1, Field k]

47
Statistical Data Structure This structure looks like the grid of a spreadsheet. However, there is one very important difference between a statistical data structure and a spreadsheet.

48
Statistical Data Structure The spreadsheet is organized around individual cells, while the statistical data structure is organized around the rows.
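The contrast can be sketched loosely in Python, with invented values. In the first structure each cell is addressed on its own; in the second, each row is one case and a variable is read across all cases.

```python
# Spreadsheet view: every cell is addressed independently (invented values).
sheet = {
    "A1": "id", "B1": "age",
    "A2": 1,    "B2": 34,
    "A3": 2,    "B3": 27,
}

# Statistical view: each row is one case; fields are named variables.
cases = [
    {"id": 1, "age": 34},
    {"id": 2, "age": 27},
]

# A single cell lookup says nothing about which case the value belongs to.
cell_value = sheet["B2"]

# A row-based structure lets a statistic run over a variable across cases.
mean_age = sum(case["age"] for case in cases) / len(cases)

print(cell_value, mean_age)
```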

49
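Statistical Data Structure [Diagram: a spreadsheet grid]

50
Statistical Data Structure [Diagram: a spreadsheet grid with cells B2, E3, C5, and F7 labelled]

51
Statistical Data Structure [Diagram: a statistical data matrix with Row 1, Row 3, and Row k-1 labelled]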

52
Statistical Data Structure The next slide presents the way that this simple statistical data structure appears in SPSS.

53
[Slides 53 to 57: screenshots of the SPSS data editor, with Row 1, Row 8, Row 15, and Field 8 labelled in turn]

58
Person: GSS 10 Main

59
RECID WGHTFNL PROV DVSEX DVAGECAP

60
Adding Complexity to Data Structurally
– hierarchical: order and different record layouts for different units of analysis
– relational: 1 to n relations
– compound records: combination of units represented on each record

61
Complex Data Structure: Hierarchical Data Structure
Household 1
  Person 1
  Person 2
Household 2
Household 3
  Person 1
  Person 2
  Person 3
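A hierarchical file of this kind might be read as sketched below. The sketch assumes each line begins with a record-type code, `H` for a household record and `P` for a person record; that coding and the `read_hierarchical` helper are hypothetical. The order of the lines carries the nesting: a person belongs to the most recent household above it.

```python
# Hypothetical hierarchical file: record type, then an identifier.
lines = ["H 1", "P 1", "P 2", "H 2", "H 3", "P 1", "P 2", "P 3"]


def read_hierarchical(records):
    """Attach each person record to the household record that precedes it."""
    households = []
    for record in records:
        rectype, ident = record.split()
        if rectype == "H":
            households.append({"household": int(ident), "persons": []})
        elif rectype == "P":
            if not households:
                raise ValueError("person record found before any household")
            households[-1]["persons"].append(int(ident))
        else:
            raise ValueError(f"unknown record type: {rectype!r}")
    return households


households = read_hierarchical(lines)
for hh in households:
    print(hh)
```

Household 2 ends up with no person records, as on the slide, which is one reason such files cannot be treated as a simple rectangle.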

62
Geography: 1991 Census N9101 Population 15 years and over by age groups (17) and marital status (6a), showing labour force activity (8) and sex (3)
[Diagram: a sequence of records marked RM, RM, RM, T, T, T, RM, RM]

63
[Diagram: the same records (RM, RM, RM, T, T, T, RM, RM) with the fields PROV, FED, EA, CD, CSD, CSD Type, CCS, CMA/CA]

65
Complex Data Structure: Relational Data Structure (One to Many)
First file: R1, R2, R3, R4, R5
Second file: R1 C1, R1 C2, R1 C3, R1 C4, R3 C1, R3 C2, R4 C1, R5 C1, R5 C2
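The one-to-many relation on this slide can be sketched with two lists linked by a shared key. The sketch assumes the R labels identify records in a first file and the R/C pairs are records in a second file; the names `parents`, `children`, and `related` are illustrative only.

```python
# Records in the first file, one per R.
parents = ["R1", "R2", "R3", "R4", "R5"]

# Records in the second file: each carries the key of the R it belongs to.
children = [
    ("R1", "C1"), ("R1", "C2"), ("R1", "C3"), ("R1", "C4"),
    ("R3", "C1"), ("R3", "C2"),
    ("R4", "C1"),
    ("R5", "C1"), ("R5", "C2"),
]

# Link the two files on the shared key: one R to zero or more C records.
related = {key: [] for key in parents}
for key, child in children:
    related[key].append(child)

counts = {key: len(found) for key, found in related.items()}
print(counts)
```

R2 has no matching records in the second file, so the relation is better described as one to zero-or-more; both files stay rectangular on their own.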

66
Person: GSS 10 Union

67
RECID UNIONTYP UNIONRNK

68
GSS 10 Main, GSS 10 Union

70
Complex Data Structure: Compound Data Structure
R1 x T1 x A1
R1 x T2 x A4
R1 x T3 x A7
R1 x T4 x A3
R1 x T4 x A1
R2 x T1 x A2
R2 x T2 x A9
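As a sketch, the compound records on this slide can be split into their component units. It assumes that R, T, and A stand for a respondent, a time slot, and an activity, which the slide does not state; the variable names are illustrative.

```python
from collections import Counter

# Compound records exactly as listed on the slide.
raw = [
    "R1 x T1 x A1",
    "R1 x T2 x A4",
    "R1 x T3 x A7",
    "R1 x T4 x A3",
    "R1 x T4 x A1",
    "R2 x T1 x A2",
    "R2 x T2 x A9",
]

# Each record combines three units; split it into a (R, T, A) tuple.
records = [tuple(line.split(" x ")) for line in raw]

# The same records can be summarized by any one of the units they combine.
episodes_per_respondent = dict(Counter(r for r, _, _ in records))

print(records[0])
print(episodes_per_respondent)
```

Under that reading, R1 has two records in T4, which suggests that the row, and not the respondent or the time slot, is the basic unit in a compound file.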

71
GSS 2 Episode

72
SEQNUM DDAY NO_EPISO ACT_CODE
