1 Fluency with Information Technology: INFO100 and CSE100
Katherine Deibel, 2012-02-29

2
- We are learning about data management
- We discussed spreadsheets
- We will get into databases now
- Lab 9 will get you involved in using database software (Access)
- Project 3 will have you use both spreadsheets and databases

3
- Databases are collections of information given a structure
- We have done this before:
  - XHTML describes the layout of info on a page
  - CSS describes the styling of information
  - JavaScript describes the computation of info
  - Spreadsheets describe data organization and flow of calculations
- The repeated lesson: Give the computer structure so it can help!

4
- Some of us want to compute, but all of us want information …
- Most archived information is in tables
- Databases enhance many applications
- Databases introduce interesting ideas
- Still, there is a lot of overlap with what spreadsheets can do

5 Before relational databases, there were only “flat files”
- Structural information was difficult to describe
- All processing of information was “special cased” and required custom programs
- Information was repeated in multiple places and hard to keep consistent
- A change in the format of one file meant all related programs had to be changed

6
- Relational databases were invented in 1970 by Ted Codd
- Motivation: “The adverse impact on development productivity of requiring programmers to navigate along access paths to reach target data [...] was enormous. In addition, it was not possible to make slight changes in the layout in storage without simultaneously having to revise all programs that relied on the previous structure. [...] As a result, far too much manpower was being invested in continual (and avoidable) maintenance of application programs.”

7
- Metadata
- Focusing on the relationships between the data entries
- Manipulating data tables through operations on the tables
- Separating the physical and logical aspects of the database

8 Data about data about data about…

9
- Metadata is
  - Data about data
  - The key to making computers more useful
- A database is composed of data and its metadata
- Metadata was not available to computers in the past

10
- Bits and bytes encode the information, but that’s not all
- Tags can encode format and structure
- Example uses:
  - word processors
  - HTML
  - Oxford English Dictionary

11 byte (baɪt). Computers. [Arbitrary, prob. influenced by bit sb. 4 and bite sb.] A group of eight consecutive bits operated on as a unit in a computer.
1964 Blaauw & Brooks in IBM Systems Jrnl. III. 122 An 8-bit unit of information is fundamental to most of the formats [of the System/360]. A consecutive group of n such units constitutes a field of length n. Fixed-length fields of length one, two, four, and eight are termed bytes, halfwords, words, and double words respectively.
1964 IBM Jrnl. Res. & Developm. VIII. 97/1 When a byte of data appears from an I/O device, the CPU is seized, dumped, used and restored.
1967 P. A. Stark Digital Computer Programming xix. 351 The normal operations in fixed point are done on four bytes at a time.
1968 Dataweek 24 Jan. 1/1 Tape reading and writing is at from 34,160 to 192,000 bytes per second.

12
- The two most important kinds of metadata for us are tags and schemas
- Tags
  - Tags label a value to say what it is, for example the number 305,471,002
- Schemas
  - “Schemas” are descriptions of tables and the kinds of values they can store

13
- The Extensible Markup Language (XML) has become the standard way to add metadata to data
- Its success is largely driven by the Web
- Example (tagged values): Canada 32805041 1.61 5 80.1

14
- The best part of XML is that YOU think up the tags
- A “self-describing language”
- There are no tags to learn!!!
- That’s why it is called “extensible”
- You are already an expert on XML

15
- Tags are like XHTML
  - …
- Must be properly nested
- Allowed characters
  - Alphanumeric and _
  - No spaces!
- Everything must be tagged
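
The nesting and naming rules above can be checked mechanically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` parser; the tag names (`country`, `name`, `country name`) are invented for illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as one well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical tag names, made up for this sketch.
properly_nested = "<country><name>Canada</name></country>"
badly_nested = "<country><name>Canada</country></name>"      # tags overlap
space_in_tag = "<country name>Canada</country name>"         # space inside a tag name

print(is_well_formed(properly_nested))
print(is_well_formed(badly_nested))
print(is_well_formed(space_in_tag))
```

Only the first string is accepted; the parser rejects the other two because a tag is closed out of order or because the tag name contains a space.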

16
- When we tag in XML, we use tags in different ways
  - Identity: Say what something is
  - Affinity: Say which properties go together
  - Collection: Group like things together
- Example (tagged values):
  Isabela     4588  1707
  Fernandina   642  1494
  Tower         14    76
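
As a rough sketch of the three uses of tags, the island values from the slide can be wrapped in XML and read back with Python's standard library. The tag names (`archipelago`, `island`, `name`, `area`, `elevation`) and the meaning assigned to the two numbers are assumptions made for this example; the original markup is not shown in the transcript.

```python
import xml.etree.ElementTree as ET

# Assumed tag names: <name>, <area>, <elevation> give identity,
# <island> groups properties that belong together (affinity),
# <archipelago> gathers like things (collection).
DOC = """
<archipelago>
  <island><name>Isabela</name><area>4588</area><elevation>1707</elevation></island>
  <island><name>Fernandina</name><area>642</area><elevation>1494</elevation></island>
  <island><name>Tower</name><area>14</area><elevation>76</elevation></island>
</archipelago>
""".strip()

root = ET.fromstring(DOC)

# Turn each <island> element into one row: {tag: text, ...}
islands = [
    {child.tag: child.text for child in island}
    for island in root.findall("island")
]

for row in islands:
    print(row)
```

Each `island` element becomes one row, which is already close to the table view that relational databases use.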

17 Not really a fortress… More a specialized furniture store

18
- Database contents can be described in XML
- Any relational table can be written as XML
- Not all XML data is relational
- The difference:
  - Relational databases place further restrictions on the XML

19
- General XML approach
  - Best when the data is not rigidly structured
  - More of an ad hoc organization
- Relational database approach
  - Data comes with a rigid structure
  - Happens very frequently
- Humans (and the computers we make) really really really like structure

20
- A relational database consists of
  - Multiple tables of data
  - Descriptions of the relationships between the various tables
- Sounds simple… and it kind of is

21
- Information is stored in tables
- Each table consists of entities of one kind
- Each entity has a set of characteristics known as attributes
- Each row of a table is a tuple of values for these attributes
- Each tuple must have a unique primary key
- Relationships among the data are stored
- The table structure is called a schema
- The table contents are an instance

22
- Tables have names, attributes, tuples
- Instance: the table's rows of data
- Schema:
  Attribute  Type    Description
  ID         number  unique number (key); the primary key
  Last       text    person’s last name
  First      text    person’s first name
  Hire       date    first day on job
  Addr       text    street address
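
To make the schema/instance distinction concrete, here is a small sketch using Python's built-in `sqlite3` module. The column names and types follow the schema above; the table name `Example` is borrowed from the Select slide, the rows are made up, and SQLite itself is a stand-in for the Access software used in the lab.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema: names and kinds of values. SQLite has no separate DATE type,
# so Hire is stored as ISO-formatted text in this sketch.
conn.execute(
    """
    CREATE TABLE Example (
        ID    INTEGER PRIMARY KEY,
        Last  TEXT,
        First TEXT,
        Hire  TEXT,
        Addr  TEXT
    )
    """
)

# The instance: the rows currently in the table (hypothetical data).
conn.execute("INSERT INTO Example VALUES (1, 'Doe', 'Pat', '1990-06-01', '1 Main St')")

# The primary key must be unique, so reusing ID 1 is refused.
try:
    conn.execute("INSERT INTO Example VALUES (1, 'Roe', 'Sam', '1995-01-15', '2 Oak Ave')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Read the schema back as (attribute, type) pairs.
schema = [(name, ctype) for _, name, ctype, *_ in conn.execute("PRAGMA table_info(Example)")]
print(schema)
print(duplicate_rejected)
```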

23
- Databases are composed of multiple tables
- BUT DATA SHOULD NOT BE REPEATED!!
  - Replicated data can differ in its different locations, e.g. multiple copies of an address can differ
  - Inconsistent data is worse than no data
- Solution:
  - Keep a single copy of any data, and
  - If it is needed in multiple places, associate it with a key, and store the key rather than the data
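
The "store the key rather than the data" idea can be sketched with plain Python dictionaries. All names and values here (`addresses`, `payroll`, `mailing`, `addr_id`) are hypothetical.

```python
# One copy of each address, stored under a key.
addresses = {101: "12 Elm St"}

# Other tables store only the key, never the address text itself.
payroll = [{"employee": "Pat Doe", "addr_id": 101}]
mailing = [{"recipient": "Pat Doe", "addr_id": 101}]


def address_of(record):
    """Look up the single stored address through the record's key."""
    return addresses[record["addr_id"]]


# A move is recorded once...
addresses[101] = "98 Pine Ave"

# ...and every table that refers to key 101 sees the same new value.
print(address_of(payroll[0]))
print(address_of(mailing[0]))
```

Because the address exists in only one place, the two tables cannot disagree about it.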

24
- When looking for information, a single item or a table of answers is possible
  - “Who is currently taking FIT100?” Result: Table of students
  - “Who won the 1940 Best Actor Oscar?” Result: A table containing only a single row
  - “In what years has the US won the men's World Cup?” Result: Empty table
- A query to a database produces a table
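
A short sketch with Python's `sqlite3` module shows that every query returns a table, whether it has many rows, one row, or none. The `Enrolled` table and its contents are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Enrolled (student TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?)",
    [("Ana", "FIT100"), ("Ben", "FIT100"), ("Cam", "CSE142")],  # hypothetical rows
)


def students_in(course):
    """Run one query; the answer is always a list of rows (a table)."""
    query = "SELECT student FROM Enrolled WHERE course = ? ORDER BY student"
    return conn.execute(query, (course,)).fetchall()


many_rows = students_in("FIT100")   # several students
one_row = students_in("CSE142")     # a table with a single row
no_rows = students_in("MATH999")    # an empty table, not an error

print(many_rows, one_row, no_rows)
```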

25 Scalpel… Sponge… Union… Join…

26
- There are five primitive operations on tables to create new tables:
  - Select: pick rows from a table
  - Project: pick columns from a table
  - Union: combine two tables w/like columns
  - Difference: remove one table from another
  - Product: create “all pairs” from two tables
- Another fundamental operation is "Join":
  - Join: Combine tables based on common fields

27
- Select creates a table from the rows of another table meeting a criterion
  Select from Example On Hire < 1993
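
As an illustration only, Select can be sketched in Python with a table modeled as a list of dictionaries. The rows are hypothetical, and `Hire` is reduced to a year so that the slide's test `Hire < 1993` can be applied directly.

```python
def select(table, predicate):
    """Return a new table holding only the rows for which predicate(row) is true."""
    return [row for row in table if predicate(row)]


# Hypothetical instance of the Example table.
example = [
    {"ID": 1, "Last": "Doe", "First": "Pat", "Hire": 1989},
    {"ID": 2, "Last": "Roe", "First": "Sam", "Hire": 1995},
    {"ID": 3, "Last": "Poe", "First": "Lee", "Hire": 1992},
]

# Select from Example On Hire < 1993
early_hires = select(example, lambda row: row["Hire"] < 1993)
print([row["ID"] for row in early_hires])
```

The result keeps every column but only the rows with IDs 1 and 3; the original table is left unchanged.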

28
- Project creates a table from the columns of another table
  Project Last, First From Example
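
Project can be sketched the same way, again with a table as a list of dictionaries and made-up rows. Dropping duplicate rows after projection is a choice made here to match the relational view that a table has no repeated rows.

```python
def project(table, columns):
    """Return a new table with only the named columns, without duplicate rows."""
    result = []
    for row in table:
        picked = {column: row[column] for column in columns}
        if picked not in result:  # keep each distinct row once
            result.append(picked)
    return result


# Hypothetical instance of the Example table.
example = [
    {"ID": 1, "Last": "Doe", "First": "Pat", "Hire": 1989},
    {"ID": 2, "Last": "Roe", "First": "Sam", "Hire": 1995},
    {"ID": 3, "Last": "Poe", "First": "Lee", "Hire": 1989},
]

# Project Last, First From Example
names = project(example, ["Last", "First"])
hire_years = project(example, ["Hire"])  # 1989 appears twice, kept once

print(names)
print(hire_years)
```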

29
- Union (written like addition) combines two tables with the same attributes
  PoliticalUnits = States + Provinces
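
A minimal sketch of Union in Python, with tables as lists of dictionaries. The column names (`name`, `capital`) and the choice of sample rows are assumptions for this example.

```python
def union(a, b):
    """Combine two tables that share the same attributes, keeping each row once."""
    if a and b and set(a[0]) != set(b[0]):
        raise ValueError("union needs tables with the same attributes")
    result = []
    for row in a + b:
        if row not in result:
            result.append(row)
    return result


states = [
    {"name": "Washington", "capital": "Olympia"},
    {"name": "Oregon", "capital": "Salem"},
]
provinces = [
    {"name": "British Columbia", "capital": "Victoria"},
]

# PoliticalUnits = States + Provinces
political_units = union(states, provinces)
print([row["name"] for row in political_units])
```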

30
- Difference (written like subtraction) removes one table’s rows from another
  Eastern = States - WestCoast
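
Difference, sketched in the same style. The tables hold only a `name` column and a handful of sample states chosen for illustration.

```python
def difference(a, b):
    """Return the rows of table a that do not appear in table b."""
    return [row for row in a if row not in b]


states = [
    {"name": "Washington"},
    {"name": "Oregon"},
    {"name": "California"},
    {"name": "New York"},
    {"name": "Maine"},
]
west_coast = [
    {"name": "Washington"},
    {"name": "Oregon"},
    {"name": "California"},
]

# Eastern = States - WestCoast
eastern = difference(states, west_coast)
print([row["name"] for row in eastern])
```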

31
- Product (written like multiplication) combines columns and pairs all rows
  Colors = Blues x Reds
- Column Rule: If A has x columns and B has y columns, then A x B has x+y columns
- Row Rule: If A has m rows and B has n rows, then A x B has m∙n rows
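
The column rule and row rule can be checked with a small Python sketch. The column names (`blue`, `red`) and the color values are invented, and the sketch assumes the two tables have no column names in common.

```python
def product(a, b):
    """Pair every row of a with every row of b, merging their columns."""
    return [{**row_a, **row_b} for row_a in a for row_b in b]


blues = [{"blue": "navy"}, {"blue": "sky"}]                       # 2 rows, 1 column
reds = [{"red": "crimson"}, {"red": "scarlet"}, {"red": "rose"}]  # 3 rows, 1 column

# Colors = Blues x Reds
colors = product(blues, reds)

print(len(colors))        # row rule: m * n rows
print(len(colors[0]))     # column rule: x + y columns
print(colors[0])
```

With 2 rows and 3 rows the product has 6 rows, and each row carries 1 + 1 = 2 columns.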

32
- To the right is a man who divides database tables. Do you want to be like him?
- Seriously though
  - Division operations do exist
  - Advanced database topic
  - Not used in regular practice

33
- Join (written like a bow tie, ⋈) combines rows if a common field matches
  Homes = States ⋈ Students
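
A minimal sketch of Join in Python, matching rows on whatever column names the two tables share. The shared column `state` and all row values are assumptions for this example.

```python
def join(a, b):
    """Combine rows of a and b whose values agree on every shared column."""
    if not a or not b:
        return []
    common = set(a[0]) & set(b[0])
    return [
        {**row_a, **row_b}
        for row_a in a
        for row_b in b
        if all(row_a[column] == row_b[column] for column in common)
    ]


states = [
    {"state": "WA", "capital": "Olympia"},
    {"state": "OR", "capital": "Salem"},
]
students = [
    {"student": "Ana", "state": "WA"},
    {"student": "Ben", "state": "OR"},
    {"student": "Cam", "state": "WA"},
]

# Homes = States joined with Students on the shared "state" column
homes = join(states, students)
for row in homes:
    print(row)
```

The body of `join` is a product of the two tables filtered by a matching test, which is why Join can be built from the primitive operations.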

34
- The five DB operations can create any table from a given set of tables
- Join is not primitive, but can be built from the five primitive operations
- Join, select and project are used most often
- All modern database systems are built on these relational operations
- The operations are not usually used directly, but are used indirectly from other languages
- The SQL database language is one such example
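
To show how the operations appear indirectly in SQL, here is a sketch that runs one query through Python's `sqlite3` module. The `States` and `Students` tables and their rows are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE States (state TEXT PRIMARY KEY, capital TEXT);
    CREATE TABLE Students (student TEXT, state TEXT);
    INSERT INTO States VALUES ('WA', 'Olympia');
    INSERT INTO States VALUES ('OR', 'Salem');
    INSERT INTO Students VALUES ('Ana', 'WA');
    INSERT INTO Students VALUES ('Ben', 'OR');
    INSERT INTO Students VALUES ('Cam', 'WA');
    """
)

# One SQL statement uses three relational operations at once.
rows = conn.execute(
    """
    SELECT Students.student, States.capital                     -- project: pick columns
    FROM Students JOIN States ON Students.state = States.state  -- join: match on a common field
    WHERE States.state = 'WA'                                   -- select: pick rows
    ORDER BY Students.student
    """
).fetchall()

print(rows)
```

Note that SQL's `SELECT` keyword lists columns, so it plays the role of Project, while the `WHERE` clause does the row picking that the slides call Select.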

35 Databases are a big topic
- Physical versus logical databases
- Constructing and designing a database
- More on operations and queries
- More about XML

36
- Like many aspects of computer fluency, understanding databases is about understanding structure
  - Defining structure
  - Manipulating structure
- Databases are based around the simple notion of tables
- New tables are built from existing tables using operations

