Presentation is loading. Please wait.

Presentation is loading. Please wait.

CS 4432lecture #41 CS4432: Database Systems II Lecture #5 Professor Elke A. Rundensteiner.

Similar presentations


Presentation on theme: "CS 4432lecture #41 CS4432: Database Systems II Lecture #5 Professor Elke A. Rundensteiner."— Presentation transcript:

1 CS 4432lecture #41 CS4432: Database Systems II Lecture #5 Professor Elke A. Rundensteiner

2 CS 4432lecture #42 TODAY Lecture on chapter 3 for ~30 min Then introduction to project 1 ~20 min Project-1 team formation later today (watch for it on my.wpi)

3 CS 4432lecture #43 Storage Layout : How to lay out data on disk. ( chapter 3)

4 CS 4432lecture #44 Data Items Records Blocks Files Memory Overview

5 CS 4432lecture #45 Record - Collection of related data items (called FIELDS) E.g.: Employee record: name field, salary field, date-of-hire field,...

6 CS 4432lecture #46 Types of records: Main choices: –FIXED vs VARIABLE FORMAT –FIXED vs VARIABLE LENGTH

7 CS 4432lecture #47 A SCHEMA contains information such as: - # fields (attributes) - type of each field (length) - order of attributes in record - meaning of each field (domain) - constraints (primary key, etc). Fixed format Not associated with each record.

8 CS 4432lecture #48 Example: fixed format and length Employee record (1) E#, 2 byte integer (2) E.name, 10 char. Schema (3) Dept, 2 byte code We can simply concatenate fields. 55 s m i t h 02 83 j o n e s 01 Records

9 CS 4432lecture #49 What : Not all fields are included in the record, and possibly in different orders. Then : Record itself must contain format, i.e., it is “Self Describing”: Variable format

10 CS 4432lecture #410 Why Variable format ? “sparse” records repeating fields evolving formats

11 CS 4432lecture #411 Example: variable format and length 4I524SDROF46 Field name codes could also be strings, i.e. TAGS # Fields Code identifying field as E# Integer type Code for Ename String type Length of str.

12 CS 4432lecture #412 EXAMPLE: variable format record with repeating fields e.g., Employee has one or more children 3E_name: FredChild: SallyChild: Tom Do repeating fields always require variable format and length?

13 CS 4432lecture #413 e.g., a person and her hobbies. MarySailingChess-- Then must allocate maximum number of repeating fields (if not used, set to null)

14 CS 4432lecture #414 Many variants between fixed - variable format: Example1: Include record type in record recordtype record length tells me what to expect (i.e., points to schema) 527....

15 CS 4432lecture #415 Record header - data at beginning that describes record May contain: - pointer to schema (record type) - length of record - time stamp (create time, mod. time) - other stuff (e.g., ROW-ID in Oracle)

16 CS 4432lecture #416 Example2: Variant btw FIXED/VAR format Hybrid format : one part is fixed, other is variable E.g.: All employees have E#, name, dept; and other fields vary. 25SmithToy2retiredHobby:chess # of var fields

17 CS 4432lecture #417 Also, many variations in internal organization of record Just to show one: length of field 3F310F1 5 F212 * * * 33251520F1F2F3 total size offsets 0 1 2 3 4 5 15 20

18 CS 4432lecture #418 Question: We have seen examples for : * Fixed format and length records * Variable format and length records (a) Does fixed format and variable length make sense? (b) Does variable format and fixed length make sense?

19 CS 4432lecture #419 Data Items Records Blocks Files Memory Next:

20 CS 4432lecture #420 Next: placing records into blocks blocks... a file assume fixed length blocks assume a single file (for now)

21 CS 4432lecture #421 (1) separating records (2) spanned vs. unspanned (3) mixed record types – clustering (4) split records (5) sequencing (6) indirection Options for storing records in blocks:

22 CS 4432lecture #422 Block (a) no need to separate - fixed size recs. (b) special marker (c) give record lengths (or offsets) - within each record - in block header (1) Separating records R2R1R3

23 CS 4432lecture #423 Unspanned: records must be within one block block 1 block 2... Spanned block 1 block 2... (2) Spanned vs. Unspanned R1R2 R1 R3R4R5 R2 R3 (a) R3 (b) R6R5R4 R7 (a)

24 CS 4432lecture #424 Unspanned is much simpler, but may waste space… Spanned essential if record size > block size Spanned vs. unspanned:

25 CS 4432lecture #425 Example 10 6 records each of size 2,050 bytes (fixed) block size = 4096 bytes block 1 block 2 2050 bytes wasted 2046 2050 bytes wasted 2046 R1R2 Utiliz = 50% -> ½ of space is wasted

26 CS 4432lecture #426 Mixed - records of different types (e.g. EMPLOYEE, DEPT) allowed in same block e.g., a block (3) Mixed record types EMP e1 DEPT d1 DEPT d2

27 CS 4432lecture #427 Why do we want to mix? Answer: CLUSTERING Records that are frequently accessed together should be in the same block Problems Creates variable length records in block Must avoid duplicates (how to cluster?) Insert/deletes are harder

28 CS 4432lecture #428 Example Clustering Q1: select A#, C_NAME, C_CITY, … from DEPOSIT, CUSTOMER where DEPOSIT.C_NAME = CUSTOMER.C.NAME a block CUSTOMER,NAME=SMITH DEPOSIT,NAME=SMITH

29 CS 4432lecture #429 If Q1 frequent, clustering good But if Q2 frequent Q2: SELECT * FROM CUSTOMER CLUSTERING IS COUNTER PRODUCTIVE

30 CS 4432lecture #430 Compromise: No mixing, but keep related records in same cylinder...

31 CS 4432lecture #431 (1) Separating records (2) Spanned vs. Unspanned (3) Mixed record types - Clustering (4) Split records (5) Sequencing (6) Indirection Recap: Storing records in blocks

32 CS 4432lecture #532 Now on to Project 1 …


Download ppt "CS 4432lecture #41 CS4432: Database Systems II Lecture #5 Professor Elke A. Rundensteiner."

Similar presentations


Ads by Google