Presentation is loading. Please wait.

Presentation is loading. Please wait.

Introduction to Data Management and Relational Databases.

Similar presentations


Presentation on theme: "Introduction to Data Management and Relational Databases."— Presentation transcript:

1 Introduction to Data Management and Relational Databases

2 Outline of presentation Data Management Data Management Compare database Compare database Versus spreadsheets, word processor docs, … Versus spreadsheets, word processor docs, … Relational Databases Relational Databases Parts & Terms Parts & Terms tables, forms, queries, reports (we’ll skip reports) tables, forms, queries, reports (we’ll skip reports) fields, records, keys fields, records, keys Relationships Relationships Linking Tables Linking Tables

3 2 1 3 21 3 2 1 3 2 1 1 3 2 1 3 2 1 32 1 3 3 2 1 3 2 C B A C B A C B A Clear Cut Mature Forest Old Growth

4

5

6 Data Management Issues Organization! Organization! Data Entry (error-prone process) Data Entry (error-prone process) Quality Control – Quality Assurance Quality Control – Quality Assurance Metadata (possible data values, how collected, etc.) Metadata (possible data values, how collected, etc.) Tracking specimens, samples Tracking specimens, samples Data retrieval Data retrieval

7 Spreadsheet vs. Relational Databases Relational Database Relational Database Data entry Data entry Data storage Data storage Data retrieval Data retrieval Spreadsheet Spreadsheet Manipulating Data (eg. Pivot tables) Manipulating Data (eg. Pivot tables) Summarizing & Presenting Data (eg. graphing) Summarizing & Presenting Data (eg. graphing) (Formatting data for statistics programs) (Formatting data for statistics programs)

8 ?

9 Embedded Information Spreadsheets = “flat files” Databases = “multi-dimensional”

10

11

12 A look at spreadsheets

13 It is possible to sort and filter records in the spreadsheet (look under DATA in the menu bar). Filtering temporarily removes all unwanted records from view. This is also possible in a database, with some more sophisticated options available.

14 It is very easy to copy cells or entire rows of data in spreadsheets, but more difficult in databases (one of the few advantages of spreadsheets over databases). However, if one needs to copy-down a lot of data, then the database is not well “normalized” (discussed later).

15 It is easy to search for and replace words in spreadsheets. This is also possible in databases, but with more sophisticated search and replace options.

16 The “auto-fill” option in a spreadsheet completes a word it recognizes from entries immediately above the current one. In databases one can use a “lookup table” (discussed later) for a full list of values (eg. names), which might not yet appear in the data set.

17 Word processor files are the least capable of filtering, finding & replacing, and assisting data entry compared with spread sheets and databases.

18 Although with proper formatting a word processor document can look like a spreadsheet or database table, one cannot manipulate the rows and columns in the same way.

19 Archiving Data As an aside, the best form to archive data in is tab delimited (.txt) or comma separated values (.csv) text files As an aside, the best form to archive data in is tab delimited (.txt) or comma separated values (.csv) text files Although programs and formats come and go, all database, spreadsheet, and word processor programs know how to handle.txt and.csv files Although programs and formats come and go, all database, spreadsheet, and word processor programs know how to handle.txt and.csv files

20 Comparing databases, spreadsheets, and documents DatabaseSpreadsheetDocument SortYesYesYes FilterYesYesNo ReplaceYesYesYes Fill down No 3 YesNo Auto-complete Yes 1 No Whole row YesYesYes Link tables Yes No 2 No 1 Auto complete is done in very different ways 2 Not linking in a true relational sense, except through a database 3 Properly set up (normalized) data can be back-filled

21 Tabulations Matrix-style synopsis of data Matrix-style synopsis of data “crosstab query” in MS Access “crosstab query” in MS Access “pivot table” in MS Excel “pivot table” in MS Excel

22 ?

23 Relational Databases Four major components Four major components Tables – these are where ALL data reside Tables – these are where ALL data reside Queries – select subsets of data (retrieve data) Queries – select subsets of data (retrieve data) Forms – “windows” into data tables (views of data) Forms – “windows” into data tables (views of data) Reports – summaries of data (formatted synopses) Reports – summaries of data (formatted synopses)

24 Tables All data in relational databases reside in tables. Queries, forms, and reports are just convenient ways of looking at the data in the tables. As we shall soon see, the sizes and types of data that can be entered into a table can be regulated for better efficiency and error-proofing. And two or more tables that have a field in common can be linked to draw information from all related tables.

25 Some terminology: Each square is a “cell” of data

26 Columns = “Fields”

27 Rows = “Records”

28 To change between datasheet view and design view

29 Another way to enter design view is to click on the table name once (so that it is highlighted), then click on the design view icon. Or right-click the table name and choose design view.

30 The DESIGN VIEW of a table is where one dictates the type and range of data that can be entered into each field. This can include formatting (such as capitalization), default values, and valid/non-valid entries.

31

32 “Lookup” is used to create a list of possible values that a field can take. This example uses a list of values in the field’s properties settings (in DESIGN VIEW). In DATA VIEW the field will have a drop down list of values (“Combo Box”). The full value will be filled in when the first letter is typed.

33

34 In this example the “lookup” is set to the list of species codes in the table “Species”

35

36 Miscellaneous New records are always added at the end of the table (many people find this annoying) New records are always added at the end of the table (many people find this annoying) “Esc” once to undo current typing “Esc” once to undo current typing “Esc” twice to undo the whole record “Esc” twice to undo the whole record Changes are saved when you move off the cell Changes are saved when you move off the cell No need to save the data in a database after any changes (formatting changes must be saved) No need to save the data in a database after any changes (formatting changes must be saved)

37 ?

38 Linking Tables Fields common b/w two or more tables can link Fields common b/w two or more tables can link Keyed fields prevent duplicate entries Keyed fields prevent duplicate entries Keyed fields determine relationships b/w tables Keyed fields determine relationships b/w tables Linked tables can reduce data entry and storage needs (using an idea called data normalization) Linked tables can reduce data entry and storage needs (using an idea called data normalization)

39

40

41

42

43

44 Keys and relationships A keyed field is one that does not allow repeated values. A keyed field is one that does not allow repeated values. For example, if the field “Code Name” is keyed in a table, then the user would not be allowed to enter the same Code Name more than once (an error, “key violation” would appear). In this way, one constructs a list of unique values (eg. Code Names). For example, if the field “Code Name” is keyed in a table, then the user would not be allowed to enter the same Code Name more than once (an error, “key violation” would appear). In this way, one constructs a list of unique values (eg. Code Names).

45

46 ?

47 One-to-many relationship Because each Species Code is unique in the “keyed” Species table, and can be repeated many times in the Collections table, a “one- to-many” relationship is created between the two (indicated by the “1” and infinity). Referential integrity means that a Species Code cannot be entered into the Collections table if it is not in the Species table. Cascade Update allows one to change the species code once and propagate that change through Collections table. Cascade Delete deletes that species code in all tables that are connected. Use this feature cautiously. This is the “relationships view” of the database that allows the user to define which tables are linked and how. Keyed fields are in bold.

48

49 One-to-one relationship?

50

51 Normalization Identical data should not be repeated Identical data should not be repeated Fields with records that are repeated should be pulled out into a separate table and referenced when needed Fields with records that are repeated should be pulled out into a separate table and referenced when needed

52

53

54

55

56 ?

57 Forms “Windows into the tables” (which really contain the data). Forms allow the user to customize/arrange fields on the screen for ease of data entry and retrieval. Forms also allow the user to view several linked tables in one display. Forms display all the records in the underlying tables as opposed to subsets (see queries for this). Data can be entered and updated in forms.

58 Forms can (should) look like field collection sheets

59 Forms can combine several tables in one view (this one has five)

60 A form with a subform (ex: Loans Masterform) Here two tables are combined in one display subform

61 Using Forms: Create data entry forms that look like field sheets Create data entry forms that look like field sheets Viewing multiple related tables in one view Viewing multiple related tables in one view The easiest way to create forms in MS Access is to use the “form wizard” The easiest way to create forms in MS Access is to use the “form wizard” Select first table to be in form Select first table to be in form Select fields to be in form Select fields to be in form Select next table to be in form Select next table to be in form Select fields to be in form … Select fields to be in form … Select style of query to create Select style of query to create Once the form is created, go to Design View to modify the size and position of fields in the display Once the form is created, go to Design View to modify the size and position of fields in the display

62 ?

63 Queries Queries are used to ask questions of the data. That is, to select subsets of records based on particular search criteria. The real power of relational database queries is that data can be summarized across tables that are linked by common fields.

64

65

66

67

68 Queries Used to select subsets of data or combinations of data from linked tables Used to select subsets of data or combinations of data from linked tables Two or more tables may be combined in one query using linked fields Two or more tables may be combined in one query using linked fields Data can be entered and changed (changes will propagate through the underlying tables) Data can be entered and changed (changes will propagate through the underlying tables) Queries can also be used to append one table to another, update records given certain criteria, or make new tables for exporting as spreadsheets or documents Queries can also be used to append one table to another, update records given certain criteria, or make new tables for exporting as spreadsheets or documents

69 Creating Queries Queries are easily created in Design View Queries are easily created in Design View Click “Create query in design view” Click “Create query in design view” Select tables to be in the query Select tables to be in the query Select the fields to be in the query Select the fields to be in the query Type in any criteria to search for in each field Type in any criteria to search for in each field Set any sorting preferences Set any sorting preferences Press run button (!) Press run button (!)

70 Review Steps for creating a relational database Steps for creating a relational database

71 2 1 3 21 3 2 1 3 2 1 1 3 2 1 3 2 1 32 1 3 3 2 1 3 2 C B A C B A C B A Clear Cut Mature Old Growth Step 1: Nested Data Structure

72 Step 2: Sketch-out tables Table 1 Table 1 Trap Trap Plot Plot Site Site Date Date Species Species Genus Genus Family Family Order Order Quantity Quantity Collector Collector Table 2 Table 2 Dissolved Oxygen Dissolved Oxygen Dissolved Nitrogen Dissolved Nitrogen pH pH Temperature Temperature Plot Plot Site Site Collector Collector

73 Step 3: Data Normalization Look for fields with repeated records Look for fields with repeated records Consider pulling these fields out into a separate table that links back to the first table Consider pulling these fields out into a separate table that links back to the first table Set up look up tables or look up values Set up look up tables or look up values Designate Keyed fields Designate Keyed fields These are fields with non-repeated values These are fields with non-repeated values Keys can be sequential numbers, or unique codes Keys can be sequential numbers, or unique codes

74 Step 4: Establish one-to-many links

75 Step 5: Set up forms and queries Set up forms for easier data entry Set up forms for easier data entry Set up queries to retrieve specific data sets Set up queries to retrieve specific data sets

76 ?

77 http://nature.berkeley.edu/~poboyski/download/


Download ppt "Introduction to Data Management and Relational Databases."

Similar presentations


Ads by Google