Rensselaer Polytechnic Institute CSCI-4380 – Database Systems David Goldschmidt, Ph.D.


1 Rensselaer Polytechnic Institute CSCI-4380 – Database Systems David Goldschmidt, Ph.D.

2  What is a Database?  A collection of organized information that persists over a long period of time  Such information (or data) is managed by a DataBase Management System  DBMS:  Software used to create and manage large amounts of data efficiently and securely

3  A typical DBMS is expected to:  Provide a means by which users can create databases and specify their schemas  Give users the ability to query the database (and efficiently return results)  Store huge amounts of data  Support durability and reliability, recovering from failures and errors  Control user access to data (i.e. security)

4  Database schemas abstract elements of the real world; to do so, we use a data model  A data model describes:  Structure of the data  Operations on the data (reads and writes)  Constraints on the data

5  Transactional (and other) databases must meet the ACID test:  Atomicity: all-or-nothing execution  Consistency: relationships between data elements must adhere to defined constraints  Isolation: each transaction appears as if it occurs without other database activity  Durability: the data itself is durable in the sense that the data must never be lost
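Atomicity can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The `account` table, its `check` constraint, and the `transfer` helper are hypothetical names introduced only for this example; they are not part of the course material.

```python
import sqlite3

# Hypothetical table used only to illustrate all-or-nothing execution.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account (id int primary key, balance int check (balance >= 0))"
)
conn.execute("insert into account values (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "update account set balance = balance + ? where id = ?",
                (amount, dst),
            )
            conn.execute(
                "update account set balance = balance - ? where id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the check constraint, so the credit was undone too.
        return False


ok = transfer(conn, 1, 2, 30)       # both updates are applied
failed = transfer(conn, 1, 2, 500)  # neither update is applied
balances = dict(conn.execute("select id, balance from account"))
```

In the failed transfer the credit runs first and succeeds on its own, yet the rollback removes it, which is the all-or-nothing behavior the slide describes.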

6  A relation is a two-dimensional data structure that consists of a set of attributes and (zero or more) tuples or rows of data  Each attribute takes only simple values  i.e. strings, numbers, boolean values, dates, etc.

attribute1   attribute2       attribute3   attribute4
firstname    lastname         office       nuttiness
Mark         Goldberg         AE 108       6
Mukkai       Krishnamoorthy   Lally 305    30
Sibel        Adali            Lally 313    6

7  The relation schema consists of:  The name of the relation  The set of attributes  The name (and type) of each attribute  Other constraints  An example: Profs( firstname, lastname, office, nuttiness )

8  A relation contains a (possibly empty) set of tuples  Each tuple contains a value for every attribute in the relation schema, each drawn from the domain of that attribute  Example tuples: ( 'Mark', 'Goldberg', 'AE 108', 6 ) ( 'Mukkai', 'Krishnamoorthy', 'Lally 305', 30 ) ( 'Sibel', 'Adali', 'Lally 313', 6 )  As a set, tuple order is not significant.
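As a rough sketch, the Profs relation can be modeled as a Python set of tuples. This is an illustration only; a real DBMS stores relations very differently.

```python
# Schema and tuples taken from the Profs example in the slides.
schema = ("firstname", "lastname", "office", "nuttiness")

profs = {
    ("Mark", "Goldberg", "AE 108", 6),
    ("Mukkai", "Krishnamoorthy", "Lally 305", 30),
    ("Sibel", "Adali", "Lally 313", 6),
}

# The same tuples written in a different order form the same relation.
same_profs = {
    ("Sibel", "Adali", "Lally 313", 6),
    ("Mark", "Goldberg", "AE 108", 6),
    ("Mukkai", "Krishnamoorthy", "Lally 305", 30),
}

# A relation may also contain no tuples at all.
empty = set()

# Every tuple supplies exactly one value per attribute in the schema.
complete = all(len(t) == len(schema) for t in profs)
```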

9  A key for a relation is a set of attributes such that no pair of distinct tuples has the same values for all attributes of the key  Examples:  Social Security Number  RIN (Rensselaer ID Number)  First and last name (would this one work???)  Given a key value, we can query the relation and expect exactly one result (or zero!).  Example relation: Profs( firstname, lastname, office, nuttiness )
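A minimal sketch of checking the key property against one set of tuples follows. The helper name `is_key` is hypothetical. Note the limitation: a key is a constraint on every possible instance of the relation, so passing this check on three tuples only shows that the current data does not violate it.

```python
schema = ("firstname", "lastname", "office", "nuttiness")
profs = {
    ("Mark", "Goldberg", "AE 108", 6),
    ("Mukkai", "Krishnamoorthy", "Lally 305", 30),
    ("Sibel", "Adali", "Lally 313", 6),
}


def is_key(schema, relation, attrs):
    """Return True if no two tuples in relation agree on all of attrs."""
    positions = [schema.index(a) for a in attrs]
    seen = set()
    for t in relation:
        key_value = tuple(t[i] for i in positions)
        if key_value in seen:
            return False  # two tuples share this key value
        seen.add(key_value)
    return True


name_is_key = is_key(schema, profs, ("firstname", "lastname"))
nuttiness_is_key = is_key(schema, profs, ("nuttiness",))
```

Here `nuttiness` fails because two tuples share the value 6, while first and last name happen to be unique in this small instance.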

10  In practice, keys are used to improve efficiency of queries using such keys  And note that not all keys used this way (e.g. index keys) provide “uniqueness”  Since relations may have multiple keys, a primary key is selected  The primary key might be a separate (unused?) numeric field (what would be the use of this?)

11  To store a relation, we can use SQL to create a table in a relational database system  Example attribute (data) types include:  CHAR, VARCHAR, TEXT  BIT, INT, INTEGER, FLOAT, DOUBLE, REAL  DECIMAL, NUMERIC  DATE, DATETIME, TIMESTAMP  BLOB, MEDIUMBLOB, LONGBLOB

12
create table tablename (
  attribute1_name attribute1_type,   -- other attribute constraints might be included here
  attribute2_name attribute2_type,
  ...
  attributeN_name attributeN_type,
  constraints
);   -- might also have table options here (before the semicolon)

13
create table student (
  id int,
  name varchar(255),
  major char(4),
  gender char(1),
  dob date,
  constraint student_pk primary key (id)   -- student_pk is an arbitrary name
);
Why did we specify these attribute types?
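The statement above can be tried out with a short sketch using Python's built-in sqlite3 module. SQLite is an assumption here, chosen because it needs no installation; the course itself uses Oracle and MySQL, and SQLite treats the declared types more loosely than either. The inserted rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table student (
      id int,
      name varchar(255),
      major char(4),
      gender char(1),
      dob date,
      constraint student_pk primary key (id)
    )
    """
)

insert = "insert into student values (?, ?, ?, ?, ?)"
conn.execute(insert, (1, "Ada Example", "CSCI", "F", "1990-01-01"))

# A second row with the same id violates the primary key constraint.
try:
    conn.execute(insert, (1, "Bob Example", "MATH", "M", "1991-02-02"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("select count(*) from student").fetchone()[0]
```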

14  Removing a table from the schema: drop table tablename; truncate table tablename; (what’s the difference?)  Adding a new attribute to a table: alter table tablename add attributename attributetype;  Removing an attribute from a table: alter table tablename drop attributename;
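The schema changes above can be sketched with Python's sqlite3 module, under some loudly stated assumptions: SQLite has no `truncate table` statement, so `delete from` stands in for it here, and the `email` column and the helpers `columns` and `tables` are hypothetical names for this example. Dropping a column is left out because older SQLite versions do not support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (id int, name varchar(255))")
conn.execute("insert into student values (1, 'Ada Example')")


def columns(conn, table):
    """List the column names of a table (SQLite-specific pragma)."""
    return [row[1] for row in conn.execute(f"pragma table_info({table})")]


def tables(conn):
    """List the names of all tables in the database (SQLite-specific)."""
    query = "select name from sqlite_master where type = 'table'"
    return [row[0] for row in conn.execute(query)]


# Adding a new attribute to a table.
conn.execute("alter table student add column email varchar(255)")
after_add = columns(conn, "student")

# Stand-in for truncate: the rows are removed, but the table remains.
conn.execute("delete from student")
rows_after_delete = conn.execute("select count(*) from student").fetchone()[0]
still_exists = "student" in tables(conn)

# Dropping removes the table itself from the schema.
conn.execute("drop table student")
after_drop = tables(conn)
```

This hints at the answer to the slide's question: emptying a table keeps its definition, while dropping it removes the definition as well.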

15  Relational algebra consists of a set of simple operators that can be used to query the database  Each operator takes as input one or two relations and produces as output a relation  Think of a relation as a set of tuples  For the set operators (union, intersection, difference), the input and output relations all must have the same schema

16  Given two relations R and S that have the same schema, set operators include:  Union: ▪ R ∪ S = { tuples that are in R or S (or both) }  Intersection: ▪ R ∩ S = { tuples that are in both R and S }  Set difference: ▪ R – S = { tuples that are in R but not in S }  Remember that a set does not contain duplicates
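Because relations are sets of tuples, the three set operators map directly onto Python's built-in set operations. The relations `R` and `S` below are small made-up examples sharing the schema (firstname, lastname).

```python
# Two relations with the same schema: (firstname, lastname).
R = {("Mark", "Goldberg"), ("Sibel", "Adali")}
S = {("Sibel", "Adali"), ("Mukkai", "Krishnamoorthy")}

union = R | S          # tuples in R or S (or both)
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S
```

The tuple that appears in both R and S shows up only once in the union, since a set does not contain duplicates.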

17  The projection of a relation R on attributes A1, A2, ..., An is given by:  π_{A1,...,An}(R) = { t[A1,...,An] | t is a tuple in R }, where t[A1,...,An] contains only t's values for attributes A1, A2, ..., An  The projection is defined iff the schema of R contains attributes A1, A2, ..., An  We use projection to remove existing attributes from R (by keeping only a subset of them)  Duplicate tuples are omitted!
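Projection can be sketched over the set-of-tuples model as follows. The function name `project` and its signature are hypothetical, chosen only for this illustration.

```python
schema = ("firstname", "lastname", "office", "nuttiness")
profs = {
    ("Mark", "Goldberg", "AE 108", 6),
    ("Mukkai", "Krishnamoorthy", "Lally 305", 30),
    ("Sibel", "Adali", "Lally 313", 6),
}


def project(schema, relation, attrs):
    """Keep only the listed attributes of every tuple in relation."""
    missing = [a for a in attrs if a not in schema]
    if missing:
        # Projection is only defined on attributes present in the schema.
        raise ValueError(f"attributes not in schema: {missing}")
    positions = [schema.index(a) for a in attrs]
    # Building a set drops duplicate tuples automatically.
    return {tuple(t[i] for i in positions) for t in relation}


nuttiness_only = project(schema, profs, ("nuttiness",))
names = project(schema, profs, ("lastname", "firstname"))
```

Projecting on `nuttiness` alone yields two tuples rather than three, because the value 6 occurs twice and duplicates are omitted.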

18  Find and select all tuples from relation R that satisfy some set of conditions  Forms the basis of querying a database  The selection σ_C(R) is based on Boolean condition C over attributes of relation R  Example conditions include: ▪ A = e, A > e, A >= e, A < e, A <= e, A <> e ▪ A1 = A2, A1 <> A2 ▪ Any combination of conditions using AND, OR, NOT  A, A1, and A2 are attributes; e is a constant or expression

19  Selection selects a subset of tuples in relation R (with the schema unchanged)  σ_C(R) = { t | t is a tuple in R and t satisfies the condition C on relation R }  Selection conditions can only refer to attributes in the given relation R  For conditions spanning multiple relations, we first must combine those relations (i.e. join)
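Selection can be sketched in the same set-of-tuples model. The function name `select` is hypothetical, and the condition C is represented here as a Python function that receives one tuple as a dictionary keyed by attribute name.

```python
schema = ("firstname", "lastname", "office", "nuttiness")
profs = {
    ("Mark", "Goldberg", "AE 108", 6),
    ("Mukkai", "Krishnamoorthy", "Lally 305", 30),
    ("Sibel", "Adali", "Lally 313", 6),
}


def select(schema, relation, condition):
    """Return the tuples of relation that satisfy condition; schema is unchanged."""
    return {t for t in relation if condition(dict(zip(schema, t)))}


# Condition of the form "A = e".
nutty_six = select(schema, profs, lambda row: row["nuttiness"] == 6)

# Conditions combined with AND.
in_lally = select(
    schema,
    profs,
    lambda row: row["office"].startswith("Lally") and row["nuttiness"] > 10,
)
```

The condition only ever sees attributes of the one relation passed in, matching the restriction stated on the slide.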

20  Download and install both the Oracle and MySQL database packages noted on the course Web site

21  Design a full schema to store information about celebrities, including:  Basic information  Relationships (e.g. marriages, flings, etc.)  Issues (e.g. drugs, affairs, addictions, etc.)

