IELM 511: Information System design Introduction Part 1. ISD for well structured data – relational and other DBMS ISD for systems with non-uniformly structured.


1 IELM 511: Information System design
Introduction
Part 1. ISD for well structured data: relational and other DBMS
ISD for systems with non-uniformly structured data
Part III: (one out of)
Topics:
- Basics of web-based IS (www, web2.0, …)
- Markups, HTML, XML
- Design tools for Info Sys: UML
- APIs for mobile apps
- Security, Cryptography
- IS product lifecycles
- Algorithm analysis, P, NP, NPC
- Info storage (modeling, normalization)
- Info retrieval (Relational algebra, Calculus, SQL)
- DB integrated APIs

2 Agenda
- Relational Algebra
- Relational Calculus
- Structured Query Language (SQL)
- DB APIs

3 Recall our Bank DB design
  BRANCH( b-name, city, assets)
  CUSTOMER( cssn, c-name, street, city, banker, banker-type)
  LOAN( l-no, amount, br-name)
  PAYMENT( l-no, pay-no, date, amount)
  EMPLOYEE( e-ssn, e-name, tel, start-date, mgr-ssn)
  ACCOUNT( ac-no, balance)
  SACCOUNT( ac-no, int-rate)
  CACCOUNT( ac-no, od-amt)
  BORROWS( cust-ssn, loan-num)
  DEPOSIT( c-ssn, ac-num, access-date)
  DEPENDENT( emp-ssn, dep-name)

4 Background: Algebra
What is an algebra? The study of systems of mathematical objects and the operations defined on those objects.
Examples of algebras:
- Integers, with operations: +, -, ×, /, %, …
- Real numbers, with operations: +, -, ×, /, …
- Vectors, with operations: +, -, ·, ×, …
- Booleans, with operations: ∧, ∨, ¬, …

5 Relational Algebra
Relational Algebra:
- objects: instances of relational schemas (namely, tables)
- operations: σ, π, ×
- set-theoretic operations: ∪, ∩, -, ÷
Key concepts:
- Operator arguments: the arguments of operators are instances of schemas (tables).
- Operation closure: the outcome of an operator is itself an instance of a schema.
- Expressions: a sequence of operations can be written as an expression.
- Operator precedence: the sequence in which the operations of an expression are applied is fixed.
Compare these concepts to those in other algebras.

6 Relational Algebra: select, σ
σ: unary operator; input: one table; output: a table.
Notation: in the remainder, we will refer to an instance of a schema as a table.
LOAN
  loan_number  amount  branch_name
  L17          1000    Downtown
  L23          2000    Redwood
  L15          1500    Pennyridge
  L93          500     Mianus
  L11          900     Round Hill
  L16          1300    Pennyridge
σ[amount > 1200](LOAN)
  loan_number  amount  branch_name
  L23          2000    Redwood
  L15          1500    Pennyridge
  L16          1300    Pennyridge

7 Relational Algebra: select, σ
Conditions of the σ operator:
- Denote the criterion for selection of a given tuple
- Must be evaluated one tuple at a time
- Must evaluate to 'true' or 'false'
- Output = set of tuples for which the σ-conditions are 'true'
LOAN
  loan_number  amount  branch_name
  L17          1000    Downtown
  L23          2000    Redwood
  L15          1500    Pennyridge
  L93          500     Mianus
  L11          900     Round Hill
  L16          1300    Pennyridge
σ[(amount > 1200) ∧ (branch_name = 'Pennyridge')](LOAN)
  loan_number  amount  branch_name
  L15          1500    Pennyridge
  L16          1300    Pennyridge
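
The select operator on the two slides above can be sketched in a few lines of Python. This is only an illustration, not part of the course material: the function name `select` and the choice of a list of dicts to represent a table are assumptions made for the example.

```python
# LOAN rows copied from the slide; each tuple of the table is a dict.
LOAN = [
    {"loan_number": "L17", "amount": 1000, "branch_name": "Downtown"},
    {"loan_number": "L23", "amount": 2000, "branch_name": "Redwood"},
    {"loan_number": "L15", "amount": 1500, "branch_name": "Pennyridge"},
    {"loan_number": "L93", "amount": 500, "branch_name": "Mianus"},
    {"loan_number": "L11", "amount": 900, "branch_name": "Round Hill"},
    {"loan_number": "L16", "amount": 1300, "branch_name": "Pennyridge"},
]


def select(table, condition):
    """sigma (hypothetical helper): test the condition one tuple at a time
    and keep the tuples for which it is true."""
    return [row for row in table if condition(row)]


over_1200 = select(LOAN, lambda t: t["amount"] > 1200)
pennyridge_over_1200 = select(
    LOAN, lambda t: t["amount"] > 1200 and t["branch_name"] == "Pennyridge"
)

print([t["loan_number"] for t in over_1200])
print([t["loan_number"] for t in pennyridge_over_1200])
```

The output of `select` is again a table (a list of rows with the same columns), which is the closure property in action.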

8 Relational Algebra: project, π
π: unary operator; input: one table; output: a table.
Notation: π[list of attributes](TABLE)
LOAN
  loan_number  amount  branch_name
  L17          1000    Downtown
  L23          2000    Redwood
  L15          1500    Pennyridge
  L93          500     Mianus
  L11          900     Round Hill
  L16          1300    Pennyridge
π[loan_number, amount](LOAN)
  loan_number  amount
  L17          1000
  L23          2000
  L15          1500
  L93          500
  L11          900
  L16          1300

9 Relational Algebra: project, π
Example: Find the names of all branches that have given loans.
  π[branch_name](LOAN)
Project returns a set of tuples, so duplicates are dropped and the number of rows may be smaller than in the input.
LOAN
  loan_number  amount  branch_name
  L17          1000    Downtown
  L23          2000    Redwood
  L15          1500    Pennyridge
  L93          500     Mianus
  L11          900     Round Hill
  L16          1300    Pennyridge
π[branch_name](LOAN)
  branch_name
  Downtown
  Redwood
  Pennyridge
  Mianus
  Round Hill
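
A small Python sketch of project, showing why the row count can shrink. The helper name `project` and the representation (a set of plain tuples plus a separate schema tuple) are assumptions for illustration only.

```python
# LOAN as a set of tuples; SCHEMA names the columns in order.
SCHEMA = ("loan_number", "amount", "branch_name")
LOAN = {
    ("L17", 1000, "Downtown"),
    ("L23", 2000, "Redwood"),
    ("L15", 1500, "Pennyridge"),
    ("L93", 500, "Mianus"),
    ("L11", 900, "Round Hill"),
    ("L16", 1300, "Pennyridge"),
}


def project(table, schema, attributes):
    """pi (hypothetical helper): keep only the named columns.
    Building a set removes duplicate rows automatically."""
    positions = [schema.index(name) for name in attributes]
    return {tuple(row[i] for i in positions) for row in table}


branches = project(LOAN, SCHEMA, ["branch_name"])
numbers_and_amounts = project(LOAN, SCHEMA, ["loan_number", "amount"])

# 6 input rows; Pennyridge appears twice, so only 5 branch names remain.
print(len(LOAN), len(branches), len(numbers_and_amounts))
```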

10 Relational Algebra: combining operations
Example: Find the names of all branches that have given loans larger than 1200.
  π[branch_name](σ[amount > 1200](LOAN))
LOAN
  loan_number  amount  branch_name
  L17          1000    Downtown
  L23          2000    Redwood
  L15          1500    Pennyridge
  L93          500     Mianus
  L11          900     Round Hill
  L16          1300    Pennyridge
Result
  branch_name
  Redwood
  Pennyridge

11 Relational Algebra: combining operations
Example: Find the names of all branches that have given loans larger than 1200.
  X = σ[amount > 1200](LOAN)
  Y = π[branch_name](X)
Note: expressions impose a sequence in which the operations are performed.
LOAN
  loan_number  amount  branch_name
  L17          1000    Downtown
  L23          2000    Redwood
  L15          1500    Pennyridge
  L93          500     Mianus
  L11          900     Round Hill
  L16          1300    Pennyridge
X
  loan_number  amount  branch_name
  L23          2000    Redwood
  L15          1500    Pennyridge
  L16          1300    Pennyridge
Y
  branch_name
  Redwood
  Pennyridge
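
Because every operator returns a table, operators can be chained. The sketch below mirrors the X and Y steps from the slide using Python comprehensions; the `Loan` namedtuple is an assumed representation chosen for readability, not something the slides prescribe.

```python
from collections import namedtuple

Loan = namedtuple("Loan", ["loan_number", "amount", "branch_name"])

LOAN = [
    Loan("L17", 1000, "Downtown"),
    Loan("L23", 2000, "Redwood"),
    Loan("L15", 1500, "Pennyridge"),
    Loan("L93", 500, "Mianus"),
    Loan("L11", 900, "Round Hill"),
    Loan("L16", 1300, "Pennyridge"),
]

# Step 1: X = select[amount > 1200](LOAN)
X = [t for t in LOAN if t.amount > 1200]

# Step 2: Y = project[branch_name](X); a set drops the duplicate Pennyridge.
Y = {t.branch_name for t in X}

# The same query as one nested expression: the inner selection runs first.
nested = {t.branch_name for t in (u for u in LOAN if u.amount > 1200)}

print(sorted(Y))
```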

12 Relational Algebra: join, ×
Join is useful when the information required is in two (or more) tables.
Tables are sets of tuples, and the join of two tables produces a cartesian product of the two sets.
Background (set theory): cartesian product, A × B = { (x, y) | x ∈ A, y ∈ B }
Example: A = { 1, 2, 3 }, B = { a, s }
  A × B = { (1, a), (1, s), (2, a), (2, s), (3, a), (3, s) }
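
The set-theory example above can be checked directly with Python's standard `itertools.product`. The strings "a" and "s" stand in for the symbols a and s on the slide.

```python
from itertools import product

A = {1, 2, 3}
B = {"a", "s"}

# A x B = {(x, y) | x in A, y in B}
AxB = set(product(A, B))

print(sorted(AxB))
print(len(AxB) == len(A) * len(B))
```

The size of the product is always |A| × |B|, which is why a cartesian product of two real tables grows so quickly.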

13 Relational Algebra: join, ×
Cartesian product, BORROWS × LOAN
BORROWS
  customer     loan_no
  111-12-0000  L17
  222-12-0000  L23
  333-12-0000  L15
  444-00-0000  L93
  666-12-0000  L17
  111-12-0000  L11
  999-12-0000  L17
  777-12-0000  L16
LOAN
  loan_number  amount  branch_name
  L17          1000    Downtown
  L23          2000    Redwood
  L15          1500    Pennyridge
  L93          500     Mianus
  L11          900     Round Hill
  L16          1300    Pennyridge
BORROWS × LOAN (5 columns, 8 × 6 = 48 rows)
  customer     loan_no  loan_number  amount  branch_name
  111-12-0000  L17      L17          1000    Downtown
  111-12-0000  L17      L23          2000    Redwood
  111-12-0000  L17      L15          1500    Pennyridge
  …            …        …            …       …
  777-12-0000  L16      L16          1300    Pennyridge

14 Relational Algebra: join, ×
Usually, a cartesian product produces several tuples with unrelated information.
A θ-join specifies a θ-condition (the same as a selection criterion) to restrict the output of a join to meaningful tuples only.
Example: Find the loan no, amount and branch name for all customers.
  BORROWS ×[loan_no = loan_number] LOAN
Result (5 columns, 8 rows) [Why?]
  customer     loan_no  loan_number  amount  branch_name
  111-12-0000  L17      L17          1000    Downtown
  222-12-0000  L23      L23          2000    Redwood
  333-12-0000  L15      L15          1500    Pennyridge
  444-00-0000  L93      L93          500     Mianus
  666-12-0000  L17      L17          1000    Downtown
  111-12-0000  L11      L11          900     Round Hill
  999-12-0000  L17      L17          1000    Downtown
  777-12-0000  L16      L16          1300    Pennyridge
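
A sketch of the θ-join as "cartesian product, then filter", using the BORROWS and LOAN rows from the slides. The helper `theta_join` and the tuple representation are assumptions for illustration; a real DBMS would not materialise all 48 pairs.

```python
# Rows copied from the slides: BORROWS(customer, loan_no),
# LOAN(loan_number, amount, branch_name).
BORROWS = [
    ("111-12-0000", "L17"), ("222-12-0000", "L23"), ("333-12-0000", "L15"),
    ("444-00-0000", "L93"), ("666-12-0000", "L17"), ("111-12-0000", "L11"),
    ("999-12-0000", "L17"), ("777-12-0000", "L16"),
]
LOAN = [
    ("L17", 1000, "Downtown"), ("L23", 2000, "Redwood"),
    ("L15", 1500, "Pennyridge"), ("L93", 500, "Mianus"),
    ("L11", 900, "Round Hill"), ("L16", 1300, "Pennyridge"),
]


def theta_join(left, right, condition):
    """Hypothetical helper: pair every left row with every right row and
    keep the concatenated row only when condition(left, right) holds."""
    return [l + r for l in left for r in right if condition(l, r)]


# A condition that is always true gives the plain cartesian product.
cartesian = theta_join(BORROWS, LOAN, lambda b, l: True)

# theta-condition: BORROWS.loan_no = LOAN.loan_number
joined = theta_join(BORROWS, LOAN, lambda b, l: b[1] == l[0])

print(len(cartesian), len(joined))
```

One way to read the "[Why?]" on the slide: each loan_no in BORROWS matches exactly one loan_number in LOAN, so every one of the 8 BORROWS rows survives exactly once.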

15 Relational Algebra: dot-notation in join, ×
Two tables being joined may have the same attribute name (possibly denoting two different things). To distinguish the columns in the θ-join, attribute names are written in dot-notation, TABLE.attribute.
The following are all equivalent:
- C = BORROWS ×[loan_no = loan_number] LOAN
- C = BORROWS ×[BORROWS.loan_no = LOAN.loan_number] LOAN
- A = BORROWS, B = LOAN, C = A ×[A.loan_no = B.loan_number] B

16 Relational Algebra: set theoretic operations, ∪
Since a table is a set of tuples, it is possible to make a union of two tables.
BUT: we require closure (the union of two tables should be a table).
⇒ Union is defined for two tables with identical schemas.
Example: Find the names of customers who have either a deposit or a loan with the bank.
  A = π[customer](BORROWS) ∪ π[c_ssn](DEPOSIT)
  RESULT = π[name](A ×[A.customer = CUSTOMER.ssn] CUSTOMER)
RESULT
  name
  Jones
  Smith
  Hayes
  Curry
  Turner
  Williams
  Adams
  Johnson
  Brooks
  Lindsay

17 Relational Algebra: set theoretic operations, ∩
Other set theoretic operations can be applied with the same rules.
Example: Find the names of customers who have both a deposit and a loan with the bank.
  A = π[customer](BORROWS) ∩ π[c_ssn](DEPOSIT)
  RESULT = π[name](A ×[A.customer = CUSTOMER.ssn] CUSTOMER)
π[c_ssn](DEPOSIT)
  c_ssn
  888-12-0000
  222-12-0000
  333-12-0000
  555-00-0000
  111-12-0000
  000-12-0000
π[customer](BORROWS)
  customer
  111-12-0000
  222-12-0000
  333-12-0000
  444-00-0000
  666-12-0000
  999-12-0000
  777-12-0000
A (the intersection of the two columns above)
  customer
  111-12-0000
  222-12-0000
  333-12-0000
RESULT
  name
  Jones
  Smith
  Hayes

18 Relational Algebra: set theoretic operations, -
Other set theoretic operations (same rules).
Example: Find the names of customers who have a deposit but no loan.
  A = π[c_ssn](DEPOSIT) - π[customer](BORROWS)
  RESULT = π[name](A ×[A.c_ssn = CUSTOMER.ssn] CUSTOMER)
π[c_ssn](DEPOSIT)
  c_ssn
  888-12-0000
  222-12-0000
  333-12-0000
  555-00-0000
  111-12-0000
  000-12-0000
π[customer](BORROWS)
  customer
  111-12-0000
  222-12-0000
  333-12-0000
  444-00-0000
  666-12-0000
  999-12-0000
  777-12-0000
A (the first column minus the second)
  customer
  888-12-0000
  555-12-0000
  000-12-0000
RESULT
  name
  Johnson
  Turner
  Lindsay
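
Python's built-in `set` type offers the same three operations, which makes the last three slides easy to try out. The values below are the distinct entries of π[customer](BORROWS) and π[c_ssn](DEPOSIT) as printed on the slides; the variable names are made up for the example.

```python
# Distinct values of the two projected columns, copied from the slides.
borrowers = {
    "111-12-0000", "222-12-0000", "333-12-0000", "444-00-0000",
    "666-12-0000", "999-12-0000", "777-12-0000",
}
depositors = {
    "888-12-0000", "222-12-0000", "333-12-0000", "555-00-0000",
    "111-12-0000", "000-12-0000",
}

either = borrowers | depositors        # union
both = borrowers & depositors          # intersection
loan_only = borrowers - depositors     # difference: loan but no deposit
deposit_only = depositors - borrowers  # difference: deposit but no loan

print(sorted(both))
print(len(either), len(loan_only), len(deposit_only))
```

Union and intersection are symmetric, but difference is not: swapping the operands answers a different question, so the order in the expression has to match the wording of the query.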

19 Relational Algebra: set theoretic operations, ÷
Set division extends the meaning of integer division, in the sense that it 'cancels away' common multiples. It is useful in answering 'for all' queries.
Example: Do all the loan officers have the same manager?
A solution: find the ssn of the person who manages all the loan officers.
  A = π[banker](σ[b_type = 'LO'](CUSTOMER))
  B = π[mgr_ssn, e_ssn](EMPLOYEE)
  RESULT = B ÷ A
B
  mgr_ssn      e_ssn
  321-32-4321  111-22-3333
  111-22-3333  333-11-4444
  111-22-3333  123-45-6789
  321-32-4321  555-66-8888
  888-99-9999  987-65-4321
  777-77-7777  888-99-9999
  777-77-7777  321-32-4321
  null         777-77-7777
A
  banker
  333-11-4444
  123-45-6789
RESULT = B ÷ A
  mgr_ssn
  111-22-3333
Note: for this example, we have to specify that the common divisor in B is e_ssn.

20 Relational Algebra: set theoretic operations, ÷
Generic definition of ÷
Attribute restrictions: A ÷ B is defined only for A(R, C) and B(C), where R and C are sets of attributes.
Output: the output contains each t_i[R] such that ∀ tuples t_j[C] ∈ B, ∃ a tuple t ∈ A in which t[C] = t_j[C] and t[R] = t_i[R].
[Figure: A(r1 … rm, c1 … ck) ÷ B(c1 … ck) = OUTPUT(r1 … rm), where R is the attribute set and C is the common attribute set.]
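
The generic definition translates almost word for word into Python: "for all" becomes `all(...)` and "there exists a tuple in A" becomes a membership test. This is a minimal sketch for the special case where R and C are single attributes; the function name `divide` is an assumption, and the data is the manager/loan-officer example from the previous slide, with `None` standing in for null.

```python
def divide(dividend, divisor):
    """Hypothetical helper for A(R, C) / B(C) with single-attribute R and C.
    dividend: set of (r, c) pairs; divisor: set of c values.
    Returns every r that is paired in dividend with ALL c in divisor."""
    candidates = {r for r, _ in dividend}
    return {r for r in candidates if all((r, c) in dividend for c in divisor)}


# (mgr_ssn, e_ssn) pairs from the slide.
manages = {
    ("321-32-4321", "111-22-3333"),
    ("111-22-3333", "333-11-4444"),
    ("111-22-3333", "123-45-6789"),
    ("321-32-4321", "555-66-8888"),
    ("888-99-9999", "987-65-4321"),
    ("777-77-7777", "888-99-9999"),
    ("777-77-7777", "321-32-4321"),
    (None, "777-77-7777"),
}

# Bankers of type LO from the slide.
loan_officers = {"333-11-4444", "123-45-6789"}

print(divide(manages, loan_officers))
```

Note that with an empty divisor the "for all" condition is vacuously true, so this sketch returns every candidate r.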

21 Relational Algebra: concluding remarks
- RA provides a formal language to get information from the database.
- RA can potentially answer any query, as long as the query pertains to exactly one row of some table derivable using expressions.
- Limitations of RA: aggregation and summary information. Examples: find the average amount of assets in the branches; find the total assets of the bank; …
- RA is procedural: an expression of RA specifies a step-by-step procedure for computing the result.

22 Relational Calculus (RC)
Background: what is a calculus?
RC is based on a formal system in logic, first order predicate calculus (fopc).
A formal system has:
- a set of symbols;
- rules about how the symbols can be arranged in well formed formulae (wff);
- a (logical) mechanism to derive whether a wff is true or false.
Additionally, fopc allows wff with 'variables' and quantifiers (∀, ∃).
A query in RC takes the form: { t | P(t) }
Meaning: the set of all tuples t for which some proposition P(t) is true. P is also called a predicate.

23 Relational Calculus (RC) examples
1. Report the loans that exceed $1200:
  { t | t ∈ LOAN ∧ t[amount] > 1200 }
2. Find the names of customers who took a loan from the Pennyridge branch:
  { t[name] | t ∈ CUSTOMER ∧ ∃ s ∈ BORROWS ( s[customer] = t[ssn] ∧ ∃ u ∈ LOAN ( u[loan_number] = s[loan_no] ∧ u[branch_name] = 'Pennyridge' ) ) }
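
Python set comprehensions have nearly the same shape as { t | P(t) }, with `any(...)` playing the role of ∃. The sketch below writes both example queries that way. LOAN is the full table from the slides; BORROWS and CUSTOMER are cut down to three rows and two columns each to keep the example short, and t is assumed to range over CUSTOMER in the second query.

```python
# LOAN(loan_number, amount, branch_name): all rows from the slides.
LOAN = {
    ("L17", 1000, "Downtown"), ("L23", 2000, "Redwood"),
    ("L15", 1500, "Pennyridge"), ("L93", 500, "Mianus"),
    ("L11", 900, "Round Hill"), ("L16", 1300, "Pennyridge"),
}
# Subsets of the slide data, trimmed for brevity (an assumption of this sketch).
BORROWS = {("333-12-0000", "L15"), ("777-12-0000", "L16"), ("111-12-0000", "L17")}
CUSTOMER = {("333-12-0000", "Hayes"), ("777-12-0000", "Adams"), ("111-12-0000", "Jones")}

# 1. { t | t in LOAN and t[amount] > 1200 }
q1 = {t for t in LOAN if t[1] > 1200}

# 2. Names of customers with a loan from the Pennyridge branch:
#    exists s in BORROWS and exists u in LOAN linking t to that branch.
q2 = {
    t[1]
    for t in CUSTOMER
    if any(
        s[0] == t[0]
        and any(u[0] == s[1] and u[2] == "Pennyridge" for u in LOAN)
        for s in BORROWS
    )
}

print(sorted(u[0] for u in q1))
print(sorted(q2))
```

The comprehension states only the condition a result tuple must satisfy; Python happens to evaluate it with nested loops, but the query text itself does not prescribe that.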

24 Relational Calculus (RC) remarks
- RC is non-procedural: any way that the predicate P can be evaluated is valid.
- RC is the formal basis for Structured Query Language (SQL).
- SQL is the de facto standard language for all RDBMSs.
- In terms of functionality (i.e. the power to get some information from any DB), RA and RC are equivalent. Namely, any query that can be written in RC has an equivalent RA formula, and vice versa.
- Advantage of RC (over RA): conceptually, it is better to allow the user to define the logic of the query, but leave the procedure for computing it to the program [why?].

25 Bank tables
BRANCH
  branch_name  city        assets
  Downtown     Brooklyn    9000000
  Redwood      Palo Alto   2100000
  Pennyridge   Horseneck   1700000
  Mianus       Horseneck   400000
  Round Hill   Horseneck   8000000
  Pownal       Bennington  300000
  North Town   Rye         3700000
  Brighton     Brooklyn    7100000
EMPLOYEE
  e_ssn        e_name  tel    start_date  mgr_ssn
  111-22-3333  Jones   12345  Nov-2005    321-32-4321
  333-11-4444  Smith   54321  Mar-1998    111-22-3333
  123-45-6789  Lee     54321  Mar-1998    111-22-3333
  555-66-8888  Turner  55555  Aug-2002    321-32-4321
  987-65-4321  Jones   87621  Mar-1995    888-99-9999
  888-99-9999  Chan    87654  Feb-1980    777-77-7777
  321-32-4321  Adams   77777  Feb-1990    777-77-7777
  777-77-7777  Black   99111  Jan-1980    null

26 Bank tables (continued)
CUSTOMER
  ssn          name      street   city        banker       b_type
  111-12-0000  Jones     Main     Harrison    321-32-4321  CRM
  222-12-0000  Smith     North    Rye         321-32-4321  CRM
  333-12-0000  Hayes     Main     Harrison    321-32-4321  CRM
  444-12-0000  Curry     North    Rye         333-11-4444  LO
  555-12-0000  Turner    Putnam   Stamford    888-99-9999  DO
  666-12-0000  Williams  Nassau   Princeton   333-11-4444  LO
  777-12-0000  Adams     Spring   Pittsfield  123-45-6789  LO
  888-12-0000  Johnson   Alma     Palo Alto   888-99-9999  DO
  999-12-0000  Brooks    Senator  Brooklyn    123-45-6789  LO
  000-12-0000  Lindsay   Park     Pittsfield  888-99-9999  DO
DEPOSIT
  c_ssn        ac_num  accessDate
  888-12-0000  A101    Jan 1, 09
  222-12-0000  A215    Feb 1, 09
  333-12-0000  A102    Feb 28, 09
  555-00-0000  A305    Mar 10, 09
  888-12-0000  A201    Mar 1, 98
  111-12-0000  A217    Mar 1, 09
  000-12-0000  A101    Feb 25, 09
BORROWS
  customer     loan_no
  111-12-0000  L17
  222-12-0000  L23
  333-12-0000  L15
  444-00-0000  L93
  666-12-0000  L17
  111-12-0000  L11
  999-12-0000  L17
  777-12-0000  L16
LOAN
  loan_number  amount  branch_name
  L17          1000    Downtown
  L23          2000    Redwood
  L15          1500    Pennyridge
  L93          500     Mianus
  L11          900     Round Hill
  L16          1300    Pennyridge

27 References and Further Reading
Silberschatz, Korth, Sudarshan, Database System Concepts, McGraw Hill
Next: SQL and DB APIs

