3 Lecture SlidesLecture slides will be posted on the course website the day before classThe slides posted will NOT be complete (I want you to think during class, so won’t give you the answers to questions posed).You are encouraged to take notes on the slides.
4 by Lock, Lock, Lock Morgan, Lock, and Lock TextbookStatistics: Unlocking the Power of Databy Lock, Lock, Lock Morgan, Lock, and LockPurchasing options:Bookstore (new, used)wiley.com (e-book)Amazon.com (new, used, kindle, rent)Wiley Plus (wiley.com): interactive online text (linked videos, practice problems, odd solutions)
5 ClickersTwo options:1) Purchase an i>clicker remote (any version OK)2) Get the i>clickerGO app for your smartphoneRegister your clicker, using NetID, athttp://www.iclicker.com/support/registeryourclicker/The point of clicker questions is to motivate you to think actively about new material as it is being presentedCredit simply for clicking in
6 Class YearWhat is your class year?First-yearSophomoreJuniorSenior
7 MajorYour primary major (or potential future major) best falls under the category…Natural SciencesArts and HumanitiesSocial SciencesMath/Statistics/CSOther
8 Support My Office Hours: (in Old Chemistry 216) TA Office Hours: tbd Mon 3 – 4 pm, Wed 3 – 4 pm, Fri 1 – 3 pmTA Office Hours: tbdStatistics Education Center: (in Old Chem 211A)4 – 9 pm Sunday – Thursday in Old Chem 211Ayour TA or
9 Grade BreakdownLabs 26 points (5%) Homework 50 points (10%) Clickers 24 points (5%) Projects 100 points (20 %) Midterms 150 points (30%) Final Exam 150 points (30%) TOTAL 500 points (Up to 10 extra credit points may be earned.)
10 Labs Labs are on Thursdays in Old Chem 101, starting tomorrow Statistical software, practice analyzing dataLabs will be group basedLabs will use all free software:StatKey: lock5stat.com/statkeyOther free software: tbd
11 Homework Weekly homework due, usually on Mondays Point of homework: to LEARN!to make sure you are keeping up with the materialto prepare you for projects and examsGraded problems and practice problemsGradingGraded on a 6 point scaleLowest homework grade droppedPenalties for late homework
12 Projects Project 1 Project 2 Individual EDA, confidence intervals, hypothesis testswritten report up to 5 pages in lengthProject 2with your lab groupRegressionwritten report up to 10 pages in length
13 Exams Midterm Exams: 2/19 and 4/2 in class Final: 4/28 2 – 5pm Exams are mandatory and must be taken at the given time. Make-up exams will not be given.In extreme circumstances (severe illness), midterms may be excused only in advance. In this case the grade will be imputed by regression.
14 Keys to Success Come to class ready to think and be engaged Come to lab ready to think and be engagedDo the homework, trying it by yourself firstDo lots of practice problemsRead the textbookStay on top of the material
15 Introduction to Data SECTION 1.1 Data Cases and variables Categorical and quantitative variablesUsing data to answer a question
16 Why Statistics? Statistics is all about DATA Collecting DATADescribing DATA – summarizing, visualizingAnalyzing DATAData are everywhere! Regardless of your field, interests, lifestyle, etc., you will almost definitely have to make decisions based on data, or evaluate decisions someone else has made based on data
17 Data Data are a set of measurements taken on a set of individual units Usually data is stored and presented in a dataset, comprised of variables measured on cases
18 Cases and VariablesWe obtain information about cases or units. A variable is any characteristic that is recorded for each case.Generally each case makes up a row in a dataset, and each variable makes up a column
19 Countries of the World Country Land Area Population Rural Health InternetBirth RateLife ExpectancyHIVAfghanistan652230763.71.746.543.9Albania2740053.38.223.914.676.6Algeria34.810.610.220.872.40.1American Samoa200661077.7Andorra4708381011.121.370.510.4Angola18.104.22.168.9472Antigua and Barbuda4408663469.51175Argentina813.728.117.375.30.5Ask: What are the cases? What are the variables?
20 Intro Statistics Survey Data Ask: What are the cases? What are the variables?
21 Diet Coke and Calcium Drink Calcium Excreted Diet cola 50 62 48 55 58 6156Water46544553Ask: What are the cases? What are the variables?
22 Data US News and World Report National University Rankings Stock MarketDuke BasketballUnemployment RateHybrid CarsPublic OpinionAntidepressants and Alzheimer’s
23 Data Applicable to YouThink of a potential dataset (it doesn’t have to actually exist) that you would be interested in analyzingWhat are the cases?What are the variables?What interesting questions could it help you answer?Have students discuss, share answers
24 Counties with the highest kidney cancer death rates For fun, ask students to hypothesize about kidney cancer risk factorsCounties with the highest kidney cancer death ratesSource: Gelman et. al. Bayesian Data Anaylsis, CRC Press, 2004.
25 Counties with the lowest kidney cancer death rates Key: sample size. Smaller counties are more likely to have more extreme death rates – tell them they will learn about this.Counties with the lowest kidney cancer death ratesSource: Gelman et. al. Bayesian Data Anaylsis, CRC Press, 2004.
26 Kidney CancerIf the values in the kidney cancer dataset are rates of kidney cancer deaths, then what are the cases?The people living in the USThe counties of the USA person either has kidney cancer or doesn’t… a rate must apply to a group of people, such as a county
27 Kidney CancerIf the values in the kidney cancer dataset are yes/no, then what are the cases?The people living in the USThe counties of the USA person either has kidney cancer or doesn’t. Yes/no doesn’t make sense for a county.
28 Categorical versus Quantitative Variables are classified as either categorical or quantitative:A categorical variable divides the cases into groupsA quantitative variable measures a numerical quantity for each case
29 Categorical Quantitative Ask them to classify each as categorical or quantitative
30 Kidney CancerIf the cases in the kidney cancer dataset are counties, then the measured variable is…CategoricalQuantitativeRates are numbers (quantitative).
31 Kidney CancerIf the cases in the kidney cancer dataset are people, then the measured variable is…CategoricalQuantitativeEither having kidney cancer or not is categorical.
32 (the answer to all of these questions is yes!) VariablesFor each of the following situations:What are the variables?Is each variable categorical or quantitative?Can eating a yogurt a day cause you to lose weight?Do males find females more attractive if they wear red?Does louder music cause people to drink more beer?Are lions more likely to attack after a full moon?(the answer to all of these questions is yes!)
33 Summary Data are everywhere, and pertain to a wide variety of topics A dataset is usually comprised of variables measured on casesVariables are either categorical or quantitativeData can be used to provide information about essentially anything we are interested in and want to collect data on!
34 To Do Read Section 1.1 If you haven’t already… Get the textbook Get a clicker or app for your smartphone and register it at (for Student ID use your NetID)