
# The W’s of Data

The W’s of Data

Data

- Does data have to be numbers?
- It can be, but it doesn’t have to be.
- Without context, it’s useless!
- Consider 17, 21, 44, and 76.
- Are those data?

Data Handout

The Five W’s of Data

- Answering the Five W’s of Data provides the context of the data.
  - Who
  - What
  - When
  - Where
  - Why
  - And, if possible, How

Who

- Rows of data correspond to individual cases about whom (or about which, if they are not people) we record some characteristics.
  - Respondents: individuals who answer a survey
  - Subjects or participants: people on whom we experiment
  - Experimental units: inanimate subjects of experiments
- Data values may also be called observations, without being clear about the Who.

From the data sheet

- Who?

What

- Variables: the characteristics recorded about each individual
- Variables are usually recorded in the columns of a data table.
- Variables identify What has been measured.
- They may seem simple, but think!
- For categorical variables, it’s natural to count how many cases belong in each category.
- Quantitative variables have measurement units. The units tell how each value has been measured (its scale).

Variables

- Categorical variable: names categories and answers questions about how cases fall into those categories. Also called a qualitative variable.
  - Ex. Gender, year in school, nationality, etc.
- Quantitative variable: answers questions about the quantity of what is measured.
  - Ex. Height, weight, income, etc.
- Just because the data are numbers does not make them quantitative.
  - Ex. Zip codes
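The zip code point can be made concrete with a small sketch. The student records below are hypothetical (made-up names, heights, and zip codes), and storing zip codes as strings is one common way to signal that they are labels rather than quantities.

```python
from statistics import mean

# Hypothetical cases (rows); all values are made up for illustration.
students = [
    {"name": "Ana", "height_in": 64, "zip_code": "10027"},
    {"name": "Ben", "height_in": 70, "zip_code": "60614"},
    {"name": "Cy", "height_in": 67, "zip_code": "10027"},
]

# Quantitative: heights have units (inches), so an average is meaningful.
avg_height = mean(s["height_in"] for s in students)

# Categorical: zip codes are labels, so we count cases per category instead.
zip_counts = {}
for s in students:
    zip_counts[s["zip_code"]] = zip_counts.get(s["zip_code"], 0) + 1

print("Average height (in):", avg_height)
print("Cases per zip code:", zip_counts)
```

Averaging the zip codes would run without error if they were stored as integers, but the result would not describe anything about the students.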

From the data sheet

- What?

Why

- It’s the questions we ask of a variable that shape how we think about it.
- Ex. An end-of-class survey asks, “How valuable do you think this course will be to you?”
  - 1 = worthless, 2 = slightly, 3 = middling, 4 = reasonably, 5 = invaluable
- Is the educational value categorical or quantitative?
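One way to explore the survey question is to summarize the same answers both ways. This is only a sketch: the list of responses is hypothetical, and the 1 to 5 coding follows the scale on the slide.

```python
from collections import Counter
from statistics import mean

# Coding taken from the survey scale on the slide.
LABELS = {1: "worthless", 2: "slightly", 3: "middling",
          4: "reasonably", 5: "invaluable"}

# Hypothetical responses, made up for illustration.
responses = [4, 5, 3, 4, 2, 4, 5]

# Treated as categorical: how many students chose each answer?
counts = Counter(LABELS[r] for r in responses)

# Treated as quantitative: an average rating. This assumes the steps
# between codes (1 to 2, 2 to 3, ...) are comparable in size.
avg_rating = mean(responses)

print(counts)
print(round(avg_rating, 2))
```

Which summary is appropriate depends on Why the question was asked: a teacher asking "what share found the course worthwhile?" wants the counts, while comparing average ratings across courses treats the variable as quantitative.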

From the data sheet

- Are the variables qualitative or quantitative?
- Why?

Counts count

- When Amazon considers offering free shipping, they might first analyze how purchases are shipped.
- Counting summarizes the categorical variable, shipping method.
- We also use counts to measure quantities, such as the number of classes you are taking or how many songs you own.
- Two ways to use counts:
  - The first way is to count the cases in each category of a categorical variable. The category labels are the What, and the individuals counted are the Who.
  - The counts themselves are not data, but they are something we summarize about the data.

Example

- Back to Amazon’s shipping
- What is the categorical variable?
- What?
- Who?
- Why?

| Shipping Method | No. of purchases |
| --- | --- |
| Ground | 20,345 |
| Second-day | 7,890 |
| Overnight | 5,432 |
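As a sketch of how these counts summarize the categorical variable, the snippet below turns the counts from the slide into shares of all purchases. The variable names are illustrative only.

```python
# Counts per shipping method, copied from the table on the slide.
shipping_counts = {"Ground": 20345, "Second-day": 7890, "Overnight": 5432}

# Total number of purchases (the Who: individual purchases).
total = sum(shipping_counts.values())

# Share of purchases in each category of the variable (the What).
proportions = {method: n / total for method, n in shipping_counts.items()}

for method, share in proportions.items():
    print(f"{method}: {share:.1%}")
```

The counts and proportions describe the shipping-method variable; each individual purchase still has just one value, its shipping method.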

- The second way is when the focus is on the number of something, which is measured by counting.
- Ex. Amazon might track the growth in the number of teenage customers each month to forecast CD sales.
- What?
- Who?
- Units?
- Why?
- Is teen a category? Is it a quantitative variable?

| Month | No. of Teenage Customers |
| --- | --- |
| January | 123,456 |
| February | 234,567 |
| March | 345,678 |
| April | 456,789 |
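Here the count itself is the measured quantity, so arithmetic on it is meaningful. A minimal sketch, using the monthly figures from the slide, computes the month-to-month growth in customers:

```python
# Monthly counts copied from the table on the slide; the unit is "customers".
teen_customers = [
    ("January", 123456),
    ("February", 234567),
    ("March", 345678),
    ("April", 456789),
]

# Growth = this month's count minus the previous month's count.
growth = [
    (later_month, later_count - earlier_count)
    for (_, earlier_count), (later_month, later_count)
    in zip(teen_customers, teen_customers[1:])
]

for month, change in growth:
    print(f"{month}: +{change:,} customers")
```

Subtracting one month's count from the next only makes sense because the variable is quantitative; the same operation on shipping-method labels would have no meaning.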

Identifiers

- Is your student ID number a quantitative variable?
- Why?
- Other examples of identifiers include UPS tracking numbers, Social Security numbers, and driver’s license numbers.
- Identifier variables do not tell us anything useful about the categories because there is exactly one individual in each.
- They are used to:
  - Combine data from different sources
  - Protect confidentiality
  - Provide unique labels
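A small sketch of the first use, combining data from different sources. Both tables and the ID format are hypothetical, invented only to show the idea of matching records by identifier.

```python
# Hypothetical records from two sources, keyed by a made-up student ID.
registrar = {
    "S1001": {"year": "Sophomore"},
    "S1002": {"year": "Senior"},
}
survey = {
    "S1002": {"rating": 4},
    "S1001": {"rating": 5},
}

# Match records that share an ID and merge their variables into one row.
combined = {
    sid: {**registrar[sid], **survey[sid]}
    for sid in registrar.keys() & survey.keys()
}

for sid in sorted(combined):
    print(sid, combined[sid])
```

The ID is never averaged or counted by category here. It serves only as a unique label that lets the two sources line up, and the merged rows need not carry the students' names.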

- We must know Who, What, and Why to analyze data, but to understand more we would also like to know When, Where, and How.
- When can make a difference in the data.
  - Example: the number of women with jobs outside the home in 1900 versus the number in 2000.
- Where can make a difference in the data.
  - Example: the number of high school students participating in ice hockey in Florida versus the number in Minnesota.

We need more information…

- How data are collected matters.
  - Survey, interviews, observation, etc.
- How could surveys be flawed, especially internet surveys?

Example

- Medical researchers at a large city hospital investigating the impact of prenatal care on newborn health collected data from 882 births during 1998-2000. They kept track of the mother’s age, the number of weeks the pregnancy lasted, the type of birth (cesarean, induced, natural), the level of prenatal care the mother had (none, minimal, adequate), the birth weight and sex of the baby, and whether the baby exhibited health problems (none, minor, major).
- Identify the W’s, name the variables, specify for each variable whether its use indicates it should be treated as categorical or quantitative, and identify the units in which it was measured (or note that they were not provided).

- Homework: p. 16, #2-12 even
