CS 245 Notes 3, slide 1: Other Topics: (1) Insertion/Deletion (2) Buffer Management (3) Comparison of Schemes
CS 245 Notes 3, slide 2: Deletion [figure: a block containing record Rx]
CS 245 Notes 3, slide 3: Options: (a) Immediately reclaim space (b) Mark deleted
CS 245 Notes 3, slide 4: Options: (a) Immediately reclaim space (b) Mark deleted. For (b): may need a chain of deleted records (for re-use); need a way to mark: special characters, a delete field, or an entry in the map
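Option (b) can be sketched as follows. This is a minimal illustration, assuming fixed-size slots within one block; the class name `Block` and its fields are hypothetical and not taken from the slides.

```python
class Block:
    """Fixed-size slots; deleted slots are marked and chained for re-use."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots      # record payloads
        self.deleted = [False] * num_slots   # the "delete field" for each slot
        self.next_free = [None] * num_slots  # chain links between deleted slots
        self.free_head = None                # first deleted slot available for re-use
        self.used = 0                        # number of slots handed out so far

    def insert(self, record):
        if self.free_head is not None:       # re-use a deleted slot first
            slot = self.free_head
            self.free_head = self.next_free[slot]
            self.deleted[slot] = False
        elif self.used < len(self.slots):    # otherwise take a fresh slot
            slot = self.used
            self.used += 1
        else:
            raise MemoryError("block is full")
        self.slots[slot] = record
        return slot

    def delete(self, slot):
        # Mark the slot deleted instead of moving records to reclaim space.
        if slot >= self.used or self.deleted[slot]:
            raise KeyError(slot)
        self.deleted[slot] = True
        self.slots[slot] = None
        self.next_free[slot] = self.free_head  # push onto the chain of deleted slots
        self.free_head = slot
```

Deleting only flips a flag and links the slot into the chain, so no valid record has to move; the price is the extra space for the delete fields and chain links.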
CS 245 Notes 3, slide 5: As usual, many tradeoffs... How expensive is it to move a valid record to free space for immediate reclaim? How much space is wasted (e.g., deleted records, delete fields, free space chains, ...)?
CS 245 Notes 3, slide 6: Concern with deletions: dangling pointers [figure: a pointer to deleted record R1, labeled "?"]
CS 245 Notes 3, slide 7: Solution #1: Do not worry
CS 245 Notes 3, slide 8: Solution #2: Tombstones. E.g., leave “MARK” in map or old location
CS 245 Notes 3, slide 9: Solution #2: Tombstones. E.g., leave “MARK” in map or old location. Physical IDs [figure: a block, with one region labeled "This space never re-used" and another labeled "This space can be re-used"]
CS 245 Notes 3, slide 10: Solution #2: Tombstones. E.g., leave “MARK” in map or old location. Logical IDs [figure: a map with columns ID and LOC, containing an entry for ID 7788]. Never reuse ID 7788 nor its space in the map...
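The logical-ID variant of tombstones might look like the sketch below. It assumes an in-memory map from logical ID to location; `RecordMap` and `TOMBSTONE` are hypothetical names chosen for illustration.

```python
TOMBSTONE = object()  # stands in for the "MARK" left in the map


class RecordMap:
    def __init__(self):
        self.loc = {}       # logical ID -> location, or TOMBSTONE
        self.next_id = 0

    def add(self, location):
        rid = self.next_id  # IDs only grow, so a deleted ID is never handed out again
        self.next_id += 1
        self.loc[rid] = location
        return rid

    def delete(self, rid):
        if self.loc.get(rid, TOMBSTONE) is TOMBSTONE:
            raise KeyError(rid)
        self.loc[rid] = TOMBSTONE  # keep the map entry; only the record's space is freed

    def lookup(self, rid):
        location = self.loc[rid]   # unknown IDs raise KeyError
        if location is TOMBSTONE:
            return None            # record was deleted: the stale pointer is detected
        return location
```

A pointer that still holds a deleted ID resolves to `None` rather than to whatever record later occupies the old location, which is exactly what the tombstone buys.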
CS 245 Notes 3, slide 11: Insert. Easy case: records not in sequence. Insert new record at end of file or in a deleted slot. If records are variable size, not as easy...
CS 245 Notes 3, slide 12: Insert. Hard case: records in sequence. If free space is “close by”, not too bad... Or use overflow idea...
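The hard case can be sketched for a single block as follows. This is an illustration only, assuming integer keys and an unordered overflow area attached to the block; `SortedBlock` is a hypothetical name.

```python
import bisect


class SortedBlock:
    def __init__(self, capacity):
        self.capacity = capacity
        self.keys = []       # kept in key order
        self.overflow = []   # unordered overflow area for keys that do not fit

    def insert(self, key):
        if len(self.keys) < self.capacity:
            bisect.insort(self.keys, key)  # free space "close by": shift within the block
        else:
            self.overflow.append(key)      # no room: fall back to the overflow idea

    def scan(self):
        # Reading in sequence has to merge the overflow area back in.
        return sorted(self.keys + self.overflow)
```

The longer the overflow area grows, the more each sequential read costs, which is why the file eventually has to be reorganized.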
CS 245 Notes 3, slide 13: Interesting problems: How much free space to leave in each block, track, cylinder? How often do I reorganize file + overflow?
CS 245 Notes 3, slide 14: Free space
CS 245 Notes 3, slide 15: Buffer Management: DB features needed; policies (LRU bad?); pinned blocks; forced output; double buffering; swizzling. Slide annotation: "in prior notes"
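How pinned blocks interact with an LRU policy can be sketched as below. This is a minimal sketch, not a real DBMS buffer manager: `BufferPool` and the `read_block` callback are hypothetical, and forced output and double buffering are left out.

```python
from collections import OrderedDict


class BufferPool:
    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block  # hypothetical callback: block id -> contents
        self.frames = OrderedDict()   # block id -> contents, least recently used first
        self.pins = {}                # block id -> pin count

    def get(self, block_id, pin=False):
        if block_id in self.frames:
            self.frames.move_to_end(block_id)  # mark as most recently used
        else:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[block_id] = self.read_block(block_id)
        if pin:
            self.pins[block_id] = self.pins.get(block_id, 0) + 1
        return self.frames[block_id]

    def unpin(self, block_id):
        self.pins[block_id] -= 1

    def _evict(self):
        # Pick the least recently used block that is not pinned.
        victim = next(
            (b for b in self.frames if self.pins.get(b, 0) == 0), None
        )
        if victim is None:
            raise RuntimeError("all blocks are pinned")
        del self.frames[victim]
```

A pinned block survives eviction even when it is the oldest entry, so the DBMS can keep a block in memory regardless of what plain LRU would choose.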
CS 245 Notes 3, slide 16: Swizzling [figure labels: Memory; Disk; Rec A; block 1; block 2; block 1]
CS 245 Notes 3, slide 17: Swizzling [figure labels: Memory; Disk; Rec A; block 1; Rec A; block 2; block 1]
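The idea behind swizzling can be sketched as follows: a pointer stored as a disk address is replaced by a direct memory reference once its target is in memory. This is a sketch of swizzling on demand under simplifying assumptions: disk addresses are `(block, slot)` tuples, the "disk" is a dictionary, and `Record`, `load_block`, `follow` and "Rec B" are hypothetical names not found in the slides.

```python
class Record:
    def __init__(self, name, ptr=None):
        self.name = name
        self.ptr = ptr  # a disk address (block, slot), or a Record once swizzled


def load_block(block_id, disk, translation):
    """Copy a block into memory and register its records in the translation table."""
    records = [Record(name, ptr) for name, ptr in disk[block_id]]
    for slot, rec in enumerate(records):
        translation[(block_id, slot)] = rec  # disk address -> memory object
    return records


def follow(rec, disk, translation):
    """Dereference rec.ptr, swizzling it the first time it is followed."""
    if isinstance(rec.ptr, tuple):           # still a disk address
        block_id, _slot = rec.ptr
        if rec.ptr not in translation:       # target block not in memory yet
            load_block(block_id, disk, translation)
        rec.ptr = translation[rec.ptr]       # replace with a memory reference
    return rec.ptr


# Block 1 holds Rec A, which points at the first record of block 2.
disk = {1: [("Rec A", (2, 0))], 2: [("Rec B", None)]}
table = {}
rec_a = load_block(1, disk, table)[0]
target = follow(rec_a, disk, table)          # loads block 2 and swizzles the pointer
```

After the first `follow`, later dereferences skip the translation table entirely; a real system would also have to unswizzle pointers before writing a block back to disk, which this sketch does not do.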
CS 245 Notes 3, slide 18: Row vs Column Store. So far we assumed that fields of a record are stored contiguously (row store)... Another option is to store “like fields” together (column store)
CS 245 Notes 3, slide 19: Row Store. Example: Order table has schema: id, cust, prod, store, price, date, qty
CS 245 Notes 3, slide 20: Column Store. Example: Order consists of: id, cust, prod, store, price, date, qty. ids may or may not be stored explicitly
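The two layouts for the Order example can be contrasted in a few lines. This is an in-memory sketch only: the field names come from the slides, but the sample values and the helper names `to_row_store` and `to_column_store` are made up for illustration.

```python
FIELDS = ("id", "cust", "prod", "store", "price", "date", "qty")


def to_row_store(orders):
    # One contiguous tuple per record.
    return [tuple(o[f] for f in FIELDS) for o in orders]


def to_column_store(orders):
    # One list per field; position i in every list belongs to the same record.
    return {f: [o[f] for o in orders] for f in FIELDS}


# Hypothetical sample data.
orders = [
    {"id": 1, "cust": "c1", "prod": "p1", "store": "s1", "price": 10, "date": "d1", "qty": 2},
    {"id": 2, "cust": "c2", "prod": "p1", "store": "s2", "price": 12, "date": "d1", "qty": 1},
]
rows = to_row_store(orders)
cols = to_column_store(orders)

total_qty = sum(cols["qty"])  # an analytic read touches a single column
record_2 = rows[1]            # a record access touches a single row
```

In the column layout the record's position doubles as an implicit id, which is one way the ids need not be stored explicitly.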
CS 245 Notes 3, slide 21: Row vs Column Store. Advantages of Column Store: more compact storage (fields need not be at byte boundaries); replication/compression; efficient reads for data analytics/mining (OLAP). Advantages of Row Store: writes (multiple fields of one record) are more efficient; efficient reads for record access (OLTP)
CS 245 Notes 3, slide 22: Literature: Mike Stonebraker, Elizabeth O'Neil, Pat O'Neil, Xuedong Chen, et al. "C-Store: A Column-oriented DBMS," VLDB Conference. Commercialized as Vertica (in Boston!); also LucidDB, MonetDB, and others.
CS 245 Notes 3, slide 23: Comparison. There are 10,000,000 ways to organize my data on disk... Which is right for me?
CS 245 Notes 3, slide 25: To evaluate a given strategy, compute the following parameters:
-> space used for expected data
-> expected time to
   - fetch record given key
   - fetch record with next key
   - insert record
   - append record
   - delete record
   - update record
   - read all file
   - reorganize file
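Two of these parameters can be estimated with a back-of-the-envelope model. The sketch below is not from the slides: it assumes cost is counted in block reads, that an unordered file is searched linearly with the key equally likely to be in any block, and that a sorted file is searched by binary search over blocks. All function names are hypothetical.

```python
import math


def fetch_by_key_unordered(num_blocks):
    # Linear scan; on average half the blocks are read before the key is found.
    return (num_blocks + 1) / 2


def fetch_by_key_sorted(num_blocks):
    # Binary search over blocks, worst case.
    return math.floor(math.log2(num_blocks)) + 1


def read_all(num_blocks):
    # Every block is read once, whatever the organization.
    return num_blocks
```

For a file of 1024 blocks this gives 512.5 expected reads for the unordered file against at most 11 for the sorted one, while reading the whole file costs 1024 reads either way. The remaining parameters (insert, delete, reorganize, and so on) need the same treatment before two strategies can be compared fairly.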