Gold Rush Mining Public Library Data with R and Excel
3 free online data sources. We used IMLS.
The friendliest formats are CSV (comma-separated) and XLS. We downloaded a CSV.
Choose your tools wisely.
Get a preliminary overview of your data. Cleaning.
R is a powerful tool for statistical analysis.
View(), fix(). Measures of central tendency. Other calculations: sum() columns to find out which locale has the highest number of internet computers, relative to registered borrowers, and reference transactions.
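The per-locale comparison above can be sketched roughly as follows. This is a minimal illustration in standard-library Python rather than the R session used in the talk. The column names (library_name, locale, internet_computers, registered_borrowers) and the sample rows are made up for the example and are not the real IMLS headers or figures.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; column names are hypothetical, not the real IMLS headers.
SAMPLE_CSV = """library_name,locale,internet_computers,registered_borrowers
Alpha,City,40,20000
Bravo,City,10,5000
Charlie,Rural,6,1000
Delta,Rural,4,1000
Echo,Suburb,15,10000
"""

def computers_per_1000_borrowers(csv_text):
    """Sum both columns per locale, then divide the sums."""
    computers = defaultdict(int)
    borrowers = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        computers[row["locale"]] += int(row["internet_computers"])
        borrowers[row["locale"]] += int(row["registered_borrowers"])
    # Skip locales with no registered borrowers to avoid dividing by zero.
    return {
        locale: 1000 * computers[locale] / borrowers[locale]
        for locale in computers
        if borrowers[locale] > 0
    }

rates = computers_per_1000_borrowers(SAMPLE_CSV)
top_locale = max(rates, key=rates.get)
print(top_locale, rates[top_locale])
```

Dividing the summed columns, instead of comparing raw totals, keeps a locale with many large libraries from winning on size alone.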
From : Which libraries have the highest percentage of hourly employed ALA MLS librarians? How many libraries have no ALA MLS librarians?
Identify relevant columns: total staff hours, ALA MLS hours. Do calculations: % of total staff hours that are ALA MLS.
Stages of Analysis. Explore by basic stats: minimum % of staff hours, maximum % of total staff hours, average % of total staff hours.
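A small sketch of that calculation, again in standard-library Python rather than R and with made-up numbers: each library's ALA MLS hours are divided by its total staff hours, and the minimum, maximum and average of those percentages are reported. The library names and hour values are hypothetical, and every library is assumed to report nonzero total staff hours.

```python
import statistics

# Made-up (total staff hours, ALA MLS hours) pairs; not real survey figures.
staff_hours = {
    "Alpha": (400, 100),
    "Bravo": (200, 0),
    "Charlie": (80, 40),
    "Delta": (100, 10),
}

def pct_ala_mls(total_hours, ala_mls_hours):
    """Percentage of total staff hours worked by ALA MLS librarians.

    Assumes total_hours is nonzero.
    """
    return 100 * ala_mls_hours / total_hours

percentages = {
    name: pct_ala_mls(total, mls) for name, (total, mls) in staff_hours.items()
}

values = list(percentages.values())
lowest = min(values)
highest = max(values)
average = statistics.mean(values)
print(lowest, highest, average)
```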
Identify interesting data points. This leads to more specific questions: Type of area? Name of library?
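One way to pull out such data points, sketched in Python with made-up records: filter for libraries whose ALA MLS percentage is zero, find the library with the highest percentage, then read off the name and type of area for each. The field names (library_name, locale, pct_ala_mls) are hypothetical.

```python
# Made-up records; field names are hypothetical.
libraries = [
    {"library_name": "Alpha", "locale": "City", "pct_ala_mls": 25.0},
    {"library_name": "Bravo", "locale": "Rural", "pct_ala_mls": 0.0},
    {"library_name": "Charlie", "locale": "Suburb", "pct_ala_mls": 50.0},
    {"library_name": "Delta", "locale": "Rural", "pct_ala_mls": 0.0},
]

# Libraries with no ALA MLS hours at all.
no_mls = [lib for lib in libraries if lib["pct_ala_mls"] == 0]

# The library with the highest percentage of ALA MLS hours.
top = max(libraries, key=lambda lib: lib["pct_ala_mls"])

# Follow-up questions: which libraries are these, and in what type of area?
print(len(no_mls), [(lib["library_name"], lib["locale"]) for lib in no_mls])
print(top["library_name"], top["locale"])
```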
Beginner coder? We are Librarians. Use tools you know.
Convert to CSV or other friendly format. For analysis: Excel, Weka, SPSS.