Institute for Personal Robots in Education (IPRE) CSC 170 Computing: Science and Creativity.


1 Institute for Personal Robots in Education (IPRE) CSC 170 Computing: Science and Creativity

2 THE BIG DATA CHALLENGE How do we... process data sets that are too large for traditional algorithms and software tools? ...extract knowledge from large datasets in a wide variety of domains, from science to medicine to consumer data?

3 DIKW PYRAMID Represents structural and/or functional relationships between data, information, knowledge, and wisdom. Typically information is defined in terms of data, knowledge in terms of information, and wisdom in terms of knowledge. (Wikipedia)

4 DIKW PYRAMID Data : discrete raw facts, signals, observations of no use until...in a usable form Patient data, sensor data, scientific data

5 DIKW PYRAMID Data : discrete raw facts, signals, observations of no use until...in a usable form Patient data, sensor data, scientific data Information : processed, organized or structured data that is used for some purpose 14% of Americans are over 65

6 DIKW PYRAMID Data: discrete raw facts, signals, observations of no use until...in a usable form Patient data, sensor data, scientific data Information: processed, organized or structured data that is used for some purpose 14% of Americans are over 65 Knowledge and Wisdom: Thoughts in a person's mind believed to be true E = mc² God exists

7 DATA STORAGE HISTORY http://mashable.com/2011/10/08/digital-storage-infographic/

8 SORTING AND SEARCHING Traditional Searching & Sorting Algorithms Data must fit into main memory (RAM) Data are processed sequentially (not in parallel) http://www.sorting-algorithms.com/
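
The two constraints on this slide can be seen in a small Python sketch. It is only an illustration, not code from the course: the `readings` list and the `binary_search` helper are made up for the example. The whole list is held in RAM, and both the sort and the search run on a single processor, one step after another.

```python
from bisect import bisect_left


def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if it is absent."""
    i = bisect_left(sorted_values, target)
    if i < len(sorted_values) and sorted_values[i] == target:
        return i
    return -1


# Hypothetical sample data; the entire list must fit in main memory.
readings = [42, 7, 19, 3, 88, 19]

# sorted() works on the in-memory list sequentially (no parallelism).
ordered = sorted(readings)

# Binary search only works because the data is already sorted.
position = binary_search(ordered, 42)
missing = binary_search(ordered, 5)
```

A data set that does not fit in RAM cannot be handled this way, which is the limitation the slide points at.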

9 ANALYZING DATA Spreadsheets Structured data: Columns and rows Small to large size data sets (50 MB - 2 GB) Analysis, visualization Widely used in business Code demo: Descriptive statistics in Google Sheets
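
The slide's demo uses Google Sheets. As a rough equivalent, the same kind of descriptive statistics can be sketched in Python with the standard `statistics` module. The `ages` column below is invented sample data, standing in for one spreadsheet column.

```python
import statistics

# Hypothetical column of values, as it might appear in one spreadsheet column.
ages = [23, 35, 35, 41, 52, 67, 70]

# The usual descriptive statistics for a single numeric column.
summary = {
    "count": len(ages),
    "min": min(ages),
    "max": max(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "stdev": statistics.stdev(ages),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```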

10 VISUALIZING DATA Databases Structured Data: Tables Moderate to large data sets (2 GB - 2 TB) Storage and retrieval of relational data Logical searching and analysis of data Retrieve records for all accounts > 50000
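
The query on this slide, "retrieve records for all accounts > 50000", might look like the following sketch, which uses Python's built-in `sqlite3` module and an in-memory database. The `accounts` table, its `balance` column, and the sample rows are assumptions made for illustration; the slide does not say which attribute is compared against 50000.

```python
import sqlite3

# In-memory database so the example leaves no files behind.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: the slide names neither the table nor its columns.
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO accounts (id, name, balance) VALUES (?, ?, ?)",
    [
        (1, "Ada", 75000.0),
        (2, "Grace", 12000.0),
        (3, "Alan", 50000.0),    # exactly 50000, so NOT matched by "> 50000"
        (4, "Edsger", 98000.5),
    ],
)

# Logical search: only the records whose balance exceeds 50000.
rows = conn.execute(
    "SELECT id, name, balance FROM accounts WHERE balance > 50000 ORDER BY id"
).fetchall()
conn.close()

for row in rows:
    print(row)
```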

11 STRUCTURED QUERY LANGUAGE (SQL) Databases have tables: lists of records (1, 2, 3) Record: a list of attributes (ID, Name, Address, City) Code Demo: Show a few database queries from http://www.w3schools.com/sql/
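
The table/record/attribute vocabulary on this slide can be made concrete with a short sketch, again using Python's built-in `sqlite3` module. The `customers` table name and the three sample records are invented for the example; only the attribute names (ID, Name, Address, City) come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each record be read by attribute name

# A table is a list of records; every record has the same attributes.
conn.execute(
    "CREATE TABLE customers "
    "(ID INTEGER PRIMARY KEY, Name TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Ana", "12 Oak St", "Berlin"),   # hypothetical records
        (2, "Ben", "5 Elm Ave", "London"),
        (3, "Cy", "9 Pine Rd", "Berlin"),
    ],
)

# Fetch one record and inspect its attributes.
record = conn.execute("SELECT * FROM customers WHERE ID = ?", (2,)).fetchone()
attributes = record.keys()
ben_city = record["City"]

# A query selects the records whose attributes satisfy a condition.
berlin_names = [
    r["Name"]
    for r in conn.execute(
        "SELECT Name FROM customers WHERE City = ? ORDER BY ID", ("Berlin",)
    )
]
conn.close()
```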

12 DATASETS A data set (or dataset) is a collection of data. Most commonly a data set corresponds to the contents of a single database table, or a single statistical data matrix, where every column of the table represents a particular variable, and each row corresponds to a given member of the data set in question. The data set lists values for each of the variables, such as height and weight of an object, for each member of the data set. Each value is known as a datum. The data set may comprise data for one or more members, corresponding to the number of rows. (Wikipedia)
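
The definition above maps directly onto a small data structure. In this sketch the variable names (`height_cm`, `weight_kg`) and the three rows are made-up values chosen to match the height-and-weight example in the definition: each column is a variable, each row is a member, and each cell is a datum.

```python
# Columns are variables; the names and units here are hypothetical.
columns = ("height_cm", "weight_kg")

# Each row is one member of the data set.
rows = [
    (170.0, 65.0),
    (182.5, 80.2),
    (158.0, 52.3),
]

# The number of members equals the number of rows.
n_members = len(rows)

# All values of one variable: read down a column.
height_index = columns.index("height_cm")
heights = [row[height_index] for row in rows]

# A single datum: one variable's value for one member (second row, weight).
datum = rows[1][columns.index("weight_kg")]

print(n_members, heights, datum)
```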

13 GOOGLE PUBLIC DATA SETS Google Public Data This is the first part of today's lab Demo a bubble chart: http://www.google.com/publicdata/directory Determine something interesting

